The 15th International Conference on Principles
and Practice of Constraint Programming
Doctoral Program Proceedings
September 20-24, 2009
Welcome to the proceedings of the 2009 Constraint Programming Doctoral
Program, held in conjunction with the 15th annual Constraint Programming
conference in Lisbon, Portugal. The doctoral program is open to PhD students
in all areas related to constraint programming. All participants present work
either within the doctoral program or in the main Constraint Programming
Conference.
The papers in this proceedings are those which have been submitted directly
to the doctoral program. They contain a wide variety of work, either completed
or in progress, being undertaken by the current generation of PhD students. We
also list all students who have papers accepted into the main conference.
The Doctoral Program would not be possible without the support of many
people and organisations. In particular, we would like to thank the program
committee members and the sponsors of the Constraint Programming conference.
We hope you have an enjoyable time in Portugal and a productive conference.
Karen Petrie and Olivia Smith
Doctoral Program Chairs, 2009
The Program Committee consists of:
Peter Nightingale - University of St Andrews
Chris Jefferson - University of St Andrews
Neil Yorke-Smith - SRI International
Zeynep Kiziltan - University of Bologna
Sebastian Brand - University of Melbourne
Hubie Chen - Universitat Pompeu Fabra
Claude-Guy Quimper - University of Waterloo
Standa Zivny - University of Oxford
Justin Pearson - Uppsala University
Roland Yap - National University of Singapore
Students with papers accepted into the main conference:
Carleton Coffrin - Brown University
Alberto Delgado - IT University of Copenhagen
Aurelie Favier - INRA, Centre de recherches de Toulouse
Mohammad Fazel-Zarandi - University of Toronto
Andy Grayland - University of St Andrews, Scotland
Serdar Kadioglu - Brown University
Ronan Le Bras - Centre de recherche sur les transports
Nina Naroditskaya - University of New South Wales and NICTA, Sydney, Australia
Aurélien Rizk - INRIA
Elaine Sonderegger - University of Connecticut
David Stynes - Cork Constraint Computation Centre, University College Cork
Kevin Tierney - Brown University
Mohamed Wahbi - LIRMM/CNRS, U. of Montpellier 2, France and LIMIARF/FSR, U. of Mohammed V Agdal, Morocco
Justin Yip - Brown University
Alessandro Zanarini - Ecole Polytechnique de Montreal
Standa Zivny - University of Oxford
Table of Contents
A Filtering Technique for Non-normalized CSPs
  Marlene Arangu, Miguel A. Salido and Federico Barber . . . . . . . . 1
Continuous Search in Constraint Programming: An Initial Investigation
  Alejandro Arbelaez and Youssef Hamadi . . . . . . . . 7
Constraint Based Languages for Biological Reactions
  Marco Bottalico and Stefano Bistarelli . . . . . . . . 13
Capturing fair computations on Concurrent Constraint Languages
  Paola Campli and Stefano Bistarelli . . . . . . . . 19
Finding Stable Solutions in Constraint Satisfaction Problems
  Laura Climent, Miguel A. Salido and Federico Barber . . . . . . . . 25
Preliminary studies of BnB-ADOPT related with Soft Arc Consistency
  Patricia Gutierrez and Pedro Meseguer . . . . . . . . 31
An automaton Constraint for Local Search
  Jun He, Pierre Flener and Justin Pearson . . . . . . . . 37
Research Overview: Improved Boolean Satisfiability Techniques for Haplotype Inference
  Eric Hsu and Sheila McIlraith . . . . . . . . 43
Foundations of Symmetry Breaking Revisited
  Tim Januschowski, Barbara Smith and Marc van Dongen . . . . . . . . 48
Energetic Edge-Finder For Cumulative Resource
  Roger Kameugne and Laure Pauline Fotso . . . . . . . . 54
Dominion – A constraint solver generator
  Lars Kotthoff, Ian Miguel and Ian Gent . . . . . . . . 64
On learning CSP specifications
  Matthieu Lopez and Arnaud Lallouet . . . . . . . . 70
Propagating equalities and disequalities
  Neil Charles Armour Moore, Ian Gent and Ian Miguel . . . . . . . . 76
Tractable Benchmarks
  Justyna Petke and Peter Jeavons . . . . . . . . 82
A simple efficient exact algorithm based on independent set for Maxclique problem
  Zhe Quan and Chu Min Li . . . . . . . . 88
Exploring Local Acyclicity within Russian Doll Search
  Margarita Razgon and Gregory M. Provan . . . . . . . . 94
Optimal Solutions for Conversational Recommender Systems Based on Comparative Preferences and Linear Inequalities
  Walid Trabelsi, Nic Wilson and Derek Bridge . . . . . . . . 100
A multithreaded solving algorithm for QCSP+
  Jeremie Vautard and Arnaud Lallouet . . . . . . . . 106
A Filtering Technique for Non-normalized CSPs
Marlene Arangú (Student)
Miguel A. Salido and Federico Barber (Supervisors)
Instituto de Automática e Informática Industrial
Universidad Politécnica de Valencia.
Valencia, Spain
Abstract. In this work we present a new algorithm, called 2-C3OP, that achieves 2-consistency in binary and non-normalized CSPs. This algorithm is a reformulation of the 2-C3 algorithm; it performs the constraint checks bidirectionally and uses inference. The evaluation section shows that 2-C3OP achieves the same 2-consistency as 2-C3 while being 40% faster, and it is therefore able to prune more search space than AC3. Furthermore, 2-C3OP performs fewer constraint checks than both AC3 and 2-C3.
1 Introduction
Proposing efficient algorithms for enforcing arc-consistency (which involves two variables) has always been considered a central question in the constraint reasoning community. Thus, there are many arc-consistency algorithms such as AC1, AC2, and AC3 [13]; AC4 [14]; AC5 [15, 12]; AC6 [5]; AC7 [6]; AC8 [8]; AC2001, AC3.1 [7]; and more. However, AC3 and AC4 are the most often used [3]. Algorithms that perform arc-consistency have focused their improvements on time complexity and space complexity. The main improvements have been achieved by changing the way of propagation, appending new structures, performing bidirectional search, changing the support search, improving the propagation, etc. However, little work has been done on developing algorithms to achieve 2-consistency in binary CSPs (binary: all constraints involve only two variables).
The concept of consistency was generalized to k-consistency by [11], and an optimal k-consistency algorithm for the labeling problem was proposed by [9]. Nevertheless, it is only based on normalized CSPs. If the constraints have two variables in k-consistency (k = 2), then we talk about 2-consistency. Also, much work on arc-consistency made the simplifying assumptions that CSPs are binary and normalized (two different constraints do not involve exactly the same variables), because these notations are much simpler and new concepts are easier to present. But [16] shows a strange effect of associating arc-consistency with binary normalized CSPs: the confusion between the notions of arc-consistency and 2-consistency (2-consistency guarantees that any instantiation of a value to a variable can be consistently extended to any second variable). On binary CSPs, 2-consistency is at least as strong as arc-consistency. When the CSP is binary and normalized, arc-consistency and 2-consistency perform the same pruning. A non-normalized CSP can be transformed into a normalized one by using the intersection of valid tuples [2], but this can be a very hard task in problems with large domains.
We will focus our attention on binary and non-normalized CSPs. Figure 1 (left) shows a binary CSP with two variables X1 and X2, D1 = D2 = {1, 2, 3}, and two constraints R12 : X1 ≤ X2 and R′12 : X1 ≠ X2, presented in [16]. It can be observed that this CSP is arc-consistent due to the fact that every value of every variable has a support for constraints R12 and R′12. In this case, arc-consistency does not prune any value of the domain of variables X1 and X2. However (as the authors say in [16]), this CSP is not 2-consistent because the instantiation X1 = 3 cannot be extended to X2 and the instantiation X2 = 1 cannot be extended to X1. Thus, Figure 1 (right) presents the resulting CSP filtered by arc-consistency and 2-consistency. It can be observed that 2-consistency is at least as strong as arc-consistency.
Fig. 1. Example of Binary CSP.
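To make the difference concrete, the following small Python sketch (ours, not part of the paper) enumerates supports for the CSP of Figure 1 and reproduces the pruning described above: arc-consistency keeps every value, while 2-consistency removes X1 = 3 and X2 = 1.

D1 = D2 = {1, 2, 3}
constraints = [lambda a, b: a <= b,      # R12:  X1 <= X2
               lambda a, b: a != b]      # R'12: X1 != X2

# Arc-consistency: a value only needs some support for each constraint, taken separately.
ac1 = {a for a in D1 if all(any(c(a, b) for b in D2) for c in constraints)}
ac2 = {b for b in D2 if all(any(c(a, b) for a in D1) for c in constraints)}
print(ac1, ac2)   # {1, 2, 3} {1, 2, 3}: nothing is pruned

# 2-consistency: a single value of the other variable must satisfy both constraints at once.
tc1 = {a for a in D1 if any(all(c(a, b) for c in constraints) for b in D2)}
tc2 = {b for b in D2 if any(all(c(a, b) for c in constraints) for a in D1)}
print(tc1, tc2)   # {1, 2} {2, 3}: X1 = 3 and X2 = 1 are pruned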
Following standard notations and definitions in the literature [3, 4, 10], we summarize the basic definitions that we will use in the rest of the paper.
Constraint Satisfaction Problem (CSP): a triple P = ⟨X, D, R⟩ where X is a finite set of variables {X1, X2, ..., Xn}; D is a set of domains D = D1, D2, ..., Dn such that for each variable Xi ∈ X there is a finite set of values that variable Xi can take; and R is a finite set of constraints R = {R1, R2, ..., Rm} which restrict the values that the variables can simultaneously take. We denote Rij ≡ (Rij, 1) ∨ (Rij, 3)¹ as the direct constraint defined over the variables Xi and Xj, and Rji ≡ (Rij, 2) as the same constraint in the inverse direction over the variables Xi and Xj (inverse constraint). A block of constraints Cij is a set of binary constraints that involve the same variables Xi and Xj.
Instantiation: a pair (Xi = a) that represents an assignment of the value a to the variable Xi, where a is in Di. A constraint Rij is satisfied if the instantiation Xi = a, Xj = b holds in Rij. Constraint symmetry: if the value b ∈ Dj supports a value a ∈ Di, then a supports b as well.
Arc-consistency: A value a ∈ Di is arc-consistent relative to Xj iff there exists a value b ∈ Dj such that (Xi, a) and (Xj, b) satisfy the constraint Rij ((Xi = a, Xj = b) ∈ Rij). A variable Xi is arc-consistent relative to Xj iff all values in Di are arc-consistent. A CSP is arc-consistent iff all the variables are arc-consistent, i.e., all the constraints Rij and Rji are arc-consistent. (Note: here we are talking about full arc-consistency.)
¹ More information regarding the direct constraint is presented in the next section.
2-consistency: A value a ∈ Di is 2-consistent relative to Xj iff there exists a value b ∈ Dj such that (Xi, a) and (Xj, b) satisfy all the constraints Rᵏij (∀k : (Xi = a, Xj = b) ∈ Rᵏij). A variable Xi is 2-consistent relative to Xj iff all values in Di are 2-consistent. A CSP is 2-consistent iff all the variables are 2-consistent, i.e., any instantiation of a value to a variable can be consistently extended to any second variable.
2 Algorithm 2-C3OP
We present a new coarse-grained algorithm called 2-C3OP that achieves 2-consistency in binary and non-normalized CSPs (see Algorithm 1). This algorithm deals with blocks of constraints as 2-C3 [1] does, but it only needs to keep half of the blocks of constraints in Q and calls two procedures: Revise and AddQ. 2-C3OP performs bidirectional checks (as AC7 does) and performs inference to avoid unnecessary checks. This is done by using structures that are shared by all the constraints:
– suppInv: a vector whose size is the maximum size of all domains (maxD); it stores the value of Xi that supports each value of Xj.
– minSupp: an integer variable that stores the first value b ∈ Dj that supports some a ∈ Di. By using minSupp, inference is carried out to avoid unnecessary checks, because all b ∈ Dj with b < minSupp are pruned without any check once suppInv is evaluated.
– t: an integer parameter that takes values from {1, 2, 3}. This value is used to guide the Revise procedure (direct or inverse order, bidirectional search; see below).
Initially the 2-C3OP procedure stores in the queue Q the constraint blocks (Cij, t) with t = 1. Then, a simple loop is performed to select and revise the blocks of constraints stored in Q, until no change occurs (Q is empty) or the domain of a variable becomes empty. In the first case every value of every domain is 2-consistent; in the second case the procedure reports that the problem is not consistent.
The AddQ procedure adds tuples to be evaluated again. The added tuples depend on the tuples stored in Q and on the variable change. Depending on the value of t generated by AddQ, the Revise procedure will guide the search. If t = 1 the search is fully bidirectional (direct and inverse); if t = 2 the search is inverse, and it is also direct if and only if some pruning is carried out; finally, if t = 3 the search is direct, and it is also inverse if and only if some pruning is carried out.
The Revise procedure (see Algorithm 2) requires two internal variables, changei and changej. They are initialized to zero and are used to remember which domains were pruned. For instance, if Di was pruned then changei = 1, and if Dj was pruned then changej = 2; if both Di and Dj were pruned then change = 3 (because change = changei + changej). During the loop of steps 3-9, each value in Di is checked². If the value b ∈ Dj supports the value a ∈ Di then suppInv[b] = a,
² If t = 2 the inverse operator is used.
Algorithm 1: Procedure 2-C3OP
Input: A CSP, P = ⟨X, D, R⟩
Result: true and P′ (which is 2-consistent), or false and P′ (which is 2-inconsistent because some domain remains empty)
begin
 1   for every i, j do
 2       Cij = ∅
 3   for every arc Rij ∈ R do
 4       Cij ← Cij ∪ Rij
 5   for every set Cij do
 6       Q ← Q ∪ {(Cij, t) : t = 1}
 7   for each d ∈ Dmax do
 8       suppInv[d] = 0
 9   while Q ≠ ∅ do
10       select and delete (Cij, t) from queue Q, with t ∈ {1, 2, 3}
11       change = Revise((Cij, t))
12       if change > 0 then
13           if change ≥ 1 ∧ change ≤ 3 then
14               Q ← Q ∪ AddQ(change, (Cij, t))
15           else
16               return false   /* empty domain */
17   return true
end
Algorithm 2: Procedure Revise
Input: A CSP P′ defined by two variables X = (Xi, Xj), domains Di and Dj, the tuple (Cij, t) and the vector suppInv.
Result: Di such that Xi is 2-consistent relative to Xj, Dj such that Xj is 2-consistent relative to Xi, and the integer variable change
begin
 1   changei = 0; changej = 0
 2   minSupp = dummyvalue
 3   for each a ∈ Di do
 4       if ∄ b ∈ Dj such that (Xi = a, Xj = b) ∈ (Cij, t) then
 5           remove a from Di
 6           changei = 1
 7       else
 8           suppInv[b] = a
 9           if minSupp = dummyvalue then minSupp = b
10   if ([(t = 2 ∨ t = 3) ∧ changei = 1] ∨ t = 1) then
11       for each b ∈ Dj do
12           if b < minSupp then
13               remove b from Dj
14               changej = 2
15           else
16               if suppInv[b] > 0 then
17                   suppInv[b] = 0
18               else
19                   if ∄ a ∈ Di such that (Xi = a, Xj = b) ∈ (Cij, t) then
20                       remove b from Dj
21                       changej = 2
22   change = changei + changej
23   return change
end
due to the fact that the support is bidirectional (it is based on the symmetry of the constraint). Furthermore, the first value b ∈ Dj that supports a value in Di is stored in minSupp.
The second part of Algorithm 2 depends on the values of t and changei. If t = 2 or t = 3 and changei = 0, then Cij does not need to be checked, because in the previous loop the constraint has not generated any pruning. However, if t = 1 then Cij requires a full bidirectional evaluation; if t = 2 or t = 3 and changei = 1, then Cij also requires a full bidirectional evaluation. In both cases, the processing of suppInv can be done in three different ways: 1) the values b ∈ Dj with b < minSupp are pruned without performing any checking, and the variable changej is updated to changej = 2 to indicate that a change in the second loop occurred; 2) the values b with suppInv[b] > 0 are not checked, because they are supported in Di, and their entries are reset to 0 for further use of this vector; 3) the values b with b > minSupp and suppInv[b] = 0 are checked until a support a is found, and are removed otherwise. In the last case changej is set to 2 in order to indicate that a change in the second loop occurred.
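As an illustration only, here is a rough Python transcription (our reading of Algorithm 2, not the authors' C implementation) of Revise for a block Cij; domains are assumed to be sets of integers scanned in increasing order, and flipping the arguments when t = 2 is our interpretation of the "inverse operator" of footnote 2.

def revise(Di, Dj, Cij, t):
    """Di, Dj: sets of ints; Cij: list of binary relations r(a, b); t in {1, 2, 3}."""
    supp_inv = {b: None for b in Dj}            # which a in Di supports b
    min_supp = None
    change_i = change_j = 0
    holds = (lambda r, a, b: r(b, a)) if t == 2 else (lambda r, a, b: r(a, b))

    for a in sorted(Di):                        # steps 3-9: prune Di and record supports
        b = next((b for b in sorted(Dj) if all(holds(r, a, b) for r in Cij)), None)
        if b is None:
            Di.discard(a); change_i = 1
        else:
            supp_inv[b] = a
            if min_supp is None:
                min_supp = b

    if t == 1 or (t in (2, 3) and change_i == 1):   # steps 10-21: prune Dj using suppInv
        for b in sorted(Dj):
            if min_supp is not None and b < min_supp:
                Dj.discard(b); change_j = 2         # pruned by inference, no check needed
            elif supp_inv.get(b) is not None:
                supp_inv[b] = None                  # already supported bidirectionally
            elif not any(all(holds(r, a, b) for r in Cij) for a in Di):
                Dj.discard(b); change_j = 2
    return change_i + change_j                      # steps 22-23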
3 Experimental Results
The experiments were performed on random instances characterized by the 4-tuple ⟨n, d, m, c⟩, where n is the number of variables, d the domain size, m the total number of binary constraints, and c the number of constraints in each block. Constraints were implicitly encoded. We evaluated 50 test cases for each problem type. Performance was measured in terms of the number of values pruned, constraint checks and computation time. All algorithms were implemented in C on a PC with 1 GB RAM and a 3.0 GHz Pentium IV processor.
Table 1. Number of prunings and constraint checks by using AC3, 2-C3 and 2-C3OP in problems 100% non-normalized ⟨n, 20, 800, 2⟩.

          Arc-consistency (AC3)        2-consistency (2-C3)       2-consistency (2-C3OP)
   n    prunings  time   checks     prunings  time   checks          time   checks
  50       50       9    109001       140      20    66774            10    58778
  70       70       9    109401       140      20    64816            13    56816
  90       90      11    109801       140      20    63296            13    55296
 110      110      14    110201       150      20    62801            13    54801
 130      130      11    110601       170      23    63001            13    55001
 150      150      12    111001       180      24    62616            16    54616

(The single prunings column under 2-consistency covers both 2-C3 and 2-C3OP, which perform the same pruning.)
In Table 1, the results show that the number of prunings was lower (53%) in AC3 than in both 2-C3 and 2-C3OP in all cases. This is due to the fact that the problems start with the same domain size and have constraints with operators (=, ≠, ≤, or ≥), so that AC3 does not prune any value by analyzing the constraints individually. However, 2-C3 and 2-C3OP analyze the blocks of constraints, which have a mix of these operators, and so are able to prune more search space. Furthermore, 2-C3OP performed fewer constraint checks than AC3 and 2-C3, by 50% and 13% respectively. Finally, 2-C3OP was 40% faster than 2-C3.
4 Conclusions and Further Work
Filtering techniques are widely used to prune the search space of CSPs. In this paper, we have presented the 2-C3OP algorithm, a reformulation of the 2-C3 algorithm, which achieves 2-consistency in binary and non-normalized CSPs using bidirectionality and inference. The evaluation section shows that 2-C3OP achieves 2-consistency and is thus able to prune more search space than AC3. Furthermore, 2-C3OP performs fewer constraint checks than both AC3 and 2-C3, and its computation time is close to that of AC3 and better than that of 2-C3 by 40%. In further work, we will focus our attention on improved versions of 2-C3OP that avoid redundant checks by using the last value supported (as in AC2001/3.1 [7]). Furthermore, our goal is to apply these filtering techniques to distributed CSPs.
References
1. Marlene Arangú, Miguel Salido, and Federico Barber. 2-C3: From arc-consistency to 2-consistency. In SARA 2009: The Eighth Symposium on Abstraction, Reformulation and Approximation. To appear, 2009.
2. Marlene Arangú, Miguel Salido, and Federico Barber. Extending arc-consistency algorithms
for non-normalized csps. In AI2009: Twenty-ninth SGAI International Conference on Artificial Intelligence. To appear, 2009.
3. Roman Barták. Theory and practice of constraint propagation. In J. Figwer, editor, Proceedings of the 3rd Workshop on Constraint Programming in Decision and Control, 2001.
4. Christian Bessiere. Constraint propagation. Technical report, CNRS/University of Montpellier, 2006.
5. Christian Bessiere and Marie Cordier. Arc-consistency and arc-consistency again. In Proc.
of the AAAI’93, pages 108–113, Washington, USA, July 1993.
6. Christian Bessiere, E.C. Freuder, and J. C. Régin. Using constraint metaknowledge to reduce
arc consistency computation. Artificial Intelligence, 107:125–148, 1999.
7. Christian Bessiere, J. C. Régin, Roland Yap, and Yuanling Zhang. An optimal coarse-grained
arc-consistency algorithm. Artificial Intelligence, 165:165–185, 2005.
8. Assef Chmeiss and Philippe Jegou. Efficient path-consistency propagation. International
Journal on Artificial Intelligence Tools, 7:121–142, 1998.
9. Martin Cooper. An optimal k-consistency algorithm. Artificial Intelligence, 41:89–95, 1994.
10. R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
11. Eugene Freuder. Synthesizing constraint expressions. Commun ACM, 21:958–966, 1978.
12. P. Van Hentenryck, Y. Deville, and C. M. Teng. A generic arc-consistency algorithm and its
specializations. Artificial Intelligence, 57:291–321, 1992.
13. A. K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8:99–118,
1977.
14. R. Mohr and T.C. Henderson. Arc and path consistency revised. Artificial Intelligence,
28:225–233, 1986.
15. M. Perlin. Arc consistency for factorable relations. Artificial Intelligence, 53:329–342, 1992.
16. F. Rossi, P. Van Beek, and T. Walsh. Handbook of constraint programming. Elsevier, 2006.
Continuous Search in Constraint Programming:
An Initial Investigation
Alejandro Arbelaez¹ (student) and Youssef Hamadi¹,²
¹ Microsoft-INRIA joint-lab, Orsay, France, alejandro.arbelaez@inria.fr
² Microsoft Research, Cambridge, United Kingdom, youssefh@microsoft.com
Abstract. In this work, we present the concept of Continuous Search, the objective of which is to allow any user to eventually get top performance from their constraint solver. Unlike previous approaches (see [9] for a recent survey), Continuous Search does not require a large set of representative instances to be available in order to properly train and learn parameters. It only assumes that once the solver runs in a real situation (often called production mode), instances will come over time and allow for proper offline continuous training. The objective is therefore not to instantly provide good parameters for top performance, but to take advantage of the real situation to train in the background and improve the performance of the system in an incremental manner.
1 Introduction
In Constraint Programming, properly crafting a constraint model which captures all the constraints of a particular problem is often not enough to ensure acceptable runtime performance. One way to improve efficiency is to use well-known tricks like redundant and channeling constraints, or to be aware that the constraint solver has a particular global constraint that can do part of the job more efficiently. The problem with these improvements (or tricks) is that they are far from obvious. Indeed, they do not change the solution space of the original model, and for a normal user (with a classical mathematical background) it is difficult to understand why adding redundancy helps.
Because of that, normal users are often left with the tedious task of tuning the search parameters of their constraint solver, and this again is both time consuming and not necessarily straightforward. Indeed, even if tuning is conceptually simple (try different parameters, pick the best), it requires a set of representative instances in order to work properly. This might be obvious for a constraint programmer, but not for a normal user, who could train on instances far different from the ones faced by his application.
In this work, we present the concept of Continuous Search (CS), the objective of which is to allow any user to eventually get top performance from their constraint solver. Unlike previous approaches (see [9] for a recent survey), Continuous Search does not require a large set of representative instances to be available in order to properly train and learn parameters. It only assumes that once the solver runs in a real situation (often called production mode), instances will come over time and allow for proper offline continuous training. The objective is therefore not to instantly provide good parameters for top performance, but to take advantage of the 'real' situation to train in the background and improve the performance of the system in an incremental manner.
The Continuous Search paradigm uses an online learning algorithm to update a prediction function which matches instance features to the most efficient set of parameters for a given instance (e.g. variable/value selection algorithms). Since CS can start without offline training, this prediction function might be initially undefined. If this is the case, an instance is solved by running the solver with its default parameters. Once the instance is solved and the solution is given back to the application¹, we start our Continuous Search training. At that point, the instance is reused and the goal is to refine the strategy used to tackle it. This is done by a specific repair-like algorithm which perturbs the strategy to find a new strategy able to solve the problem more efficiently. If such a strategy is found, its parameters are stored with the instance features and can therefore be reused to solve similar instances more efficiently.
Technically, this means that there are two different search efforts: the one done to solve the real problem, and the one related to the long-term improvement of the constraint solver. We believe that this extra use of computational resources is realistic, since nowadays systems (especially production ones) are almost always on. Moreover, this has to be balanced against the huge computational cost of offline training [10]. Last, this late adaptation is the only way to face the 'real' instances and even to adapt to changes in the modeling or to the arrival of a completely new class of problem.
The paper is organized as follows. Background material is presented in Section 2. Section 3 introduces the continuous search paradigm. Section 4 presents
experimental results. Finally, before our general conclusion, Section 5 presents
related work.
2 Background
In this section, we briefly introduce definitions and notations used hereafter.
2.1 Constraint Satisfaction Problems
Definition 1 A Constraint Satisfaction Problem (CSP) is a triple (X, D, C)
where,
– X = {X1 , X2 , . . . , Xn } represents a set of n variables.
– D = {D1 , D2 , . . . , Dn } represents the set of associated domains, i.e., possible
values for the variables.
– C = {C1 , C2 , . . . , Cm } represents a finite set of constraints.
¹ Since our standpoint is real settings, we consider the full application stack where solvers are not isolated pieces of software called from a command line, but are critically embedded in large and complex applications.
Each constraint Ci is associated to a set of variables vars(Ci), and is used to restrict the combinations of values between these variables. Similarly, the degree deg(Xi) of a variable is the number of constraints associated to Xi, and dom(Xi) corresponds to the current domain of Xi.
Solving a CSP involves finding a solution, i.e., an assignment of values to variables such that all constraints are satisfied. If a solution exists, the problem is stated as satisfiable, and unsatisfiable otherwise.
In this paper, we consider four well-known variable selection heuristics: min-dom selects the variable with the smallest domain [4]; wdeg [2] selects the variable which is involved in the most failed constraints; dom/wdeg [2] selects the variable which minimizes the ratio dom/wdeg; and impacts [7], whose objective is to select the variable-value pair that maximizes the reduction of the remaining search space.
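For instance, a toy Python sketch (ours, purely illustrative) of the dom/wdeg rule could look as follows, where the weight of a variable stands for the accumulated weight of the constraints involving it:

def pick_dom_wdeg(domains, weights):
    """domains: {var: set of values}; weights: {var: wdeg >= 1}."""
    return min(domains, key=lambda x: len(domains[x]) / weights[x])

# x2 has the smallest |dom|/wdeg ratio (2/2 = 1), so it is selected first.
print(pick_dom_wdeg({"x1": {1, 2, 3}, "x2": {1, 2}, "x3": {1, 2, 3, 4}},
                    {"x1": 1, "x2": 2, "x3": 1}))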
2.2 Support Vector Machines
Among the prominent Machine Learning (ML) algorithms are Support Vector Machines (SVM) [3]. This algorithm is widely used in binary classification due to its statistical learning properties; it determines the separating hyperplane with maximum distance, or margin, to the closest examples (the so-called support vectors) of the training set.
Learning a high-quality model depends on the quality of the training set. On the one hand, the description of the examples must make it possible to discriminate between positive and negative examples; on the other hand, the available examples must make it possible to accurately localize the frontier between the two classes.
3 Continuous Search in Constraint Programming
The goal of this paper is to take advantage of real world situations as shown in
Figure 1, where instances are presented one at a time. Therefore, in Continuous
Search settings we consider two different phases, exploitation (or solving time)
and exploration (or learning time). The former tries to solve new instances using
the acquired knowledge, and the latter is focused on improving the classification
accuracy by means of re-using previously seen instances.
Fig. 1. Continuous Search scenario: instances I0, I1, . . . , Ik arrive one at a time, alternating solving time and learning time.
3.1 Online Learning
To deal with the continuous search scenario, we follow the same approach as in
[1], where the authors propose to use a supervised Machine Learning algorithm to
select the most appropriate heuristic at different states (so-called checkpoints) of
the search tree. To this end, we characterize CSP instances by means of features
(i.e., general information common to all CSPs). Those features are the input of
an Online SVM algorithm which selects the best heuristic within the checkpoint
window.
The features set is divided into two main categories, static and dynamic.
The former intends to distinguish instances from each other (e.g., number of
variables, constraints, etc.), while the latter is used to monitor the progress of
the search process (e.g., max. number of failures, variable’s weight, etc.). For
more details about these features see [1].
Our choice of an online SVM algorithm is motivated by the fact that it does not need to re-train the classifier from scratch once a new example arrives. Note that training a classical SVM (so-called batch learning) requires solving a full optimization problem, which is not an ideal situation in Continuous Search.
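A minimal sketch (ours, not the authors' implementation) of this prediction step could use scikit-learn's SGDClassifier as a stand-in for the online SVM; the feature encoding and the heuristic IDs of Table 1 below are illustrative assumptions:

from sklearn.linear_model import SGDClassifier
import numpy as np

HEURISTICS = [1, 2, 3, 4]             # IDs taken from Table 1 (4-heuristic setting)
clf = SGDClassifier(loss="hinge")     # linear SVM trained incrementally

def choose_heuristic(features, default=1):
    """Predict the best heuristic for a checkpoint; fall back to the default early on."""
    try:
        return int(clf.predict(np.array([features]))[0])
    except Exception:                 # model not fitted yet
        return default

def learn_from_exploration(features, winning_heuristic):
    """Feed one (checkpoint features -> best heuristic) example without re-training from scratch."""
    clf.partial_fit(np.array([features]), [winning_heuristic], classes=HEURISTICS)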
4 Experiments
This section describes the experimental validation of the proposed approach.
In these experiments we included a collection of 100 nurse-scheduling problems from the MiniZinc³ repository.
4.1 Experimental setting
The learning algorithm used in the experimental validation of the proposed approach is a Support Vector Machine with Gaussian kernel; we used the libSVM
implementation. All our CSP heuristics (see Section 2.1) are home-made implementations integrated in the Gecode-2.1.1 constraint solver.
  ID   Variable sel.   Value sel.      ID   Variable sel.   Value sel.
   1   dom/wdeg        min              5   dom/deg         min
   2   dom/wdeg        max              6   dom/deg         max
   3   wdeg            min              7   min-dom         min
   4   wdeg            max              8   impacts         —
Table 1. Candidate heuristics; the default heuristic is the first one.
Two CSP adaptive strategies have been experimented with, considering respectively the first 4 and all 8 strategies in Table 1. In all cases, the default heuristic is the first one: dom/wdeg for variable selection and min-value for value selection.
The exploration examples are generated by adding minor perturbations to the default execution of the heuristics (i.e., executing the default heuristic at each checkpoint). Thus, during the learning time, we re-ran the last seen instance replacing the default heuristic by another candidate in exactly one checkpoint; this process is repeated for each heuristic in Table 1 for a limited number of checkpoints (10 in this paper). Currently, we are working on a more informative way of selecting the exploration points, considering the distance of unlabeled examples to the decision boundary.
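Schematically, and with hypothetical function names, the exploration step just described could be sketched as:

def explore(instance, candidates, default_heuristic, solve, max_checkpoints=10):
    """solve(instance, schedule) is assumed to run the solver and return its runtime;
    schedule maps a checkpoint index to the heuristic substituted there."""
    examples = []
    baseline = solve(instance, {})                  # default heuristic at every checkpoint
    for cp in range(max_checkpoints):
        for h in candidates:
            if h == default_heuristic:
                continue
            runtime = solve(instance, {cp: h})      # exactly one perturbed checkpoint
            label = +1 if runtime < baseline else -1
            examples.append((instance, cp, h, label))
    return examples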
All experiments were performed on an 8-machine cluster running Linux Mandriva 2009; all machines are 64-bit with two quad-core 2.33 GHz processors and 8 GB of RAM. A time-out of 10 minutes was used for each experiment.
³ Available at http://www.g12.cs.mu.oz.au/minizinc/download.html
It can be observed in Figure 2 that the dynamic approach is able to solve more instances than its competitors (i.e., the default strategy and a random heuristic selection). However, the performance goes down as the number of candidate heuristics increases. The main explanation for this phenomenon lies in the fact that we are not using any sophisticated strategy for breaking ties (i.e., if several heuristics are predicted to outperform the default one, we pick one at random). We are currently studying different approaches to breaking ties, such as selecting the best algorithm using the decision value, exploiting the fact that examples that are far from the classification boundary are more likely to be correctly predicted.
[Figure 2: two plots, "flatzinc nsp-14 (4 heuristics)" and "flatzinc nsp-14 (8 heuristics)", showing run-time (s) against the number of solved instances for the def, dyn and random strategies.]
Fig. 2. Nurse Scheduling-14 (nsp-14). Note that this data shows the performance of the continuous search approach with a particular ordering of the problem instances.
5 Related work
In this section, we describe some related work that has been proposed to integrate Machine Learning algorithms into CSP and related areas such as SAT and Quantified Boolean Formulas (QBF).
SATzilla [10] is a well-known SAT portfolio solver which is built upon a set of features. Broadly speaking, SATzilla includes two kinds of features: basic features such as the number of variables, number of propagators, etc., and local search features which actually probe the search space in order to estimate the difficulty of each problem instance. The goal of SATzilla is to learn a runtime predictor using a simple linear regression model.
CPHydra [6], one of the best constraint solvers in the latest CSP competition⁴, is a portfolio approach based on case-based reasoning. Broadly speaking, CPHydra maintains a database with all solved instances (so-called cases). Later on, once a new instance arrives, a set of similar cases C is computed, and the heuristic that is able to solve the majority of instances in C is selected. The main drawback of this portfolio approach is that, due to the high complexity of selecting the best solver, it is limited to a small number of solvers (in competition settings fewer than 6 solvers were used).
⁴ http://www.cril.univ-artois.fr/CPAI08/
Our work is related to [8] in that they also apply machine learning techniques to perform an on-line combination of heuristics within search tree procedures. Their paper proposes to use a multinomial logistic regression method in order to maximize the probability of predicting the right heuristic at different states of the search procedure. Unfortunately, this work requires an important number of training instances to obtain enough generalization of the target distribution of problems, and it does not fulfill all the requirements of the Continuous Search setting.
6 Conclusion
This paper has presented an introduction to Continuous Search; this new concept includes an online algorithm to adaptively tune a constraint solver. At different states of the search, the instance features are combined with dynamic information collected by the search engine to dynamically adapt the search strategy of a well-known CP solver in order to solve the current instance more efficiently. The results in this paper show that the online approach to continuous search outperforms a very good default heuristic for solving CSPs.
7 Acknowledgements
We would like to thank the anonymous reviewers and Michele Sebag for helpful discussions on the integration of Machine Learning and Constraint Programming.
References
1. A. Arbelaez, Y. Hamadi, and M. Sebag. Online heuristic selection in constraint programming. In Symposium on Combinatorial Search (SoCS), 2009.
2. F. Boussemart, F. Hemery, C. Lecoutre, and L. Sais. Boosting systematic search
by weighting constraints. In ECAI’04, 2004.
3. N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines
and other kernel-based learning methods. Cambridge University Press, 2000.
4. R. M. Haralick and G. L. Elliott. Increasing tree search efficiency for constraint satisfaction problems. In Artificial Intelligence, 1980.
5. F. Hutter and Y. Hamadi. Parameter adjustment based on performance prediction: Towards an instance-aware problem solver. Technical Report MSR-TR-2005-125, Microsoft Research, Cambridge, UK, January 2005.
6. E. O'Mahony, E. Hebrard, A. Holland, C. Nugent, and B. O'Sullivan. Using case-based reasoning in an algorithm portfolio for constraint solving. In AICS'08, 2008.
7. P. Refalo. Impact-based search strategies for constraint programming. In CP’04,
2004.
8. H. Samulowitz and R. Memisevic. Learning to solve QBF. In AAAI'07, 2007.
9. K. Smith-Miles. Cross-disciplinary perspectives on meta-learning for algorithm
selection. ACM Comput. Surv., 41(1), 2008.
10. L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla-07: The design and analysis of an algorithm portfolio for SAT. In CP'07, 2007.
Constraint Based Languages for Biological Reactions
Marco Bottalico¹ (student)
Stefano Bistarelli¹,²,³ (supervisor)
¹ Università G. d'Annunzio, Pescara, Italy, [bottalic,bista]@sci.unich.it
² Istituto di Informatica e Telematica (CNR), Pisa, Italy, [stefano.bistarelli]@iit.cnr.it
³ Dipartimento di Matematica Informatica, Università di Perugia, Italy, bista@dipmat.unipg.it
Abstract. In this paper, we study the modeling of biochemical reactions using concurrent constraint programming idioms. In particular we consider stochastic concurrent constraint programming (sCCP), the Hybrid concurrent constraint programming language (Hcc) and the Biochemical Abstract Machine (BIOCHAM).
Keywords: Biochemical Reactions, (Stochastic/Hybrid) Concurrent Constraint Programming, Biocham.
1 Introduction
Systems biology is a science integrating experimental activity and mathematical modeling; it studies the dynamical behavior of biological systems. While current genome projects provide a huge amount of data on genes or proteins, a lot of research is still necessary to understand how the different parts of a biological system interact in order to perform complex biological functions. Mathematical and computational techniques are central in this approach to biology, as they provide the capability of formally describing living systems and studying their properties.
A variety of formalisms for modeling biological systems has been proposed in the literature. In [1], the author distinguishes three basic approaches: discrete, stochastic and continuous; additionally there are various combinations of them. Discrete models are based on discrete variables and discrete state changes; continuous models are based on differential equations that typically model biochemical reactions; finally, in the stochastic approach, probabilities may appear explicitly in random variables and random numbers, or implicitly as in kinetic laws. In the latter approach we have a simplified representation of processes and an integration of stochastic noise in order to get more realistic models. The need to capture both discrete and continuous phenomena motivates the study of hybrid dynamical systems [5].
The goal of this paper is to show different kinds of languages to model biochemical reactions in order to compare them and to use their appropriate features in different ways.
2 Background on Biochemical reactions and Blood Coagulation
In this paper, we want to examine biochemical reactions: they are chemical reactions involving mainly proteins. In a cell there are many different proteins; hence, the number of reactions that can take place can be very high. All the interactions that take place in a cell can be used to create a diagram, obtaining a biochemical reaction network.
We will examine in the following one of the thirteen enzymatic reactions of blood coagulation [12], in the generic form, through the Michaelis-Menten kinetics:

    XI + XIIa  ⇌  XI:XIIa  ⇀  XI + XIa

with rates k1 (binding), k−1 (unbinding) and k2 (product formation), where XI is the enzyme (E) that binds the substrate (S) = XIIa to form an enzyme-substrate complex (ES) = XI:XIIa. Afterwards we have the formation of the product (P) = XIa and the release of the unchanged enzyme (E) = XI, ready for a new reaction.
We are interested in blood coagulation [12]. It is part of an important host defense mechanism termed hemostasis (the cessation of blood loss from a damaged vessel). Blood clotting is a very delicately balanced system; when hemostatic functions fail, hemorrhage or thromboembolic phenomena result. The chemical reactions that constitute the whole process can be seen as a decomposition into many kinds of enzymatic reactions, involving reactants, products, enzymes, substrates, stoichiometric coefficients, proteins, inhibitors and chemical accelerators.
Under the Michaelis-Menten hypotheses [4], the most important equations are:

KM = (k−1 + k2)/k1 : the Michaelis constant. It measures the affinity of the enzyme for the substrate: if KM is small there is a high affinity, and vice versa.

VMAX = V0 = k2[E0] : the maximum rate, which would be achieved when all of the enzyme molecules have substrate bound (Hp1). [E0] is the starting quantity of enzyme E. k2 is also called kcat.

d[P]/dt = VMAX [S] / (KM + [S]) : this final equation is usually called the “Michaelis-Menten equation”. It shows the speed of the formation of the product.

When the amount of product P is small, this will be a good approximation and the equations can now be integrated:

E:  d[E]/dt  = (k2 + k−1)[ES] − k1[E][S]
S:  d[S]/dt  = k−1[ES] − k1[E][S]
ES: d[ES]/dt = k1[E][S] − (k−1 + k2)[ES]
P:  d[P]/dt  = k2[ES]
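As a purely illustrative aside (not in the paper), these four equations can be integrated numerically in a few lines of Python; the rate constants and initial amounts below are the ones appearing in the Hybrid cc fragment of Section 4:

k1, km1, k2 = 1.0, 0.1, 0.01
e, s, es, p = 10.0, 5.0, 0.01, 0.0      # [E], [S], [ES], [P]
dt = 0.001
for _ in range(int(50 / dt)):           # explicit Euler over 50 time units
    bind, unbind, cat = k1 * e * s, km1 * es, k2 * es
    e  += dt * (unbind + cat - bind)    # d[E]/dt  = (k2 + k-1)[ES] - k1[E][S]
    s  += dt * (unbind - bind)          # d[S]/dt  = k-1[ES] - k1[E][S]
    es += dt * (bind - unbind - cat)    # d[ES]/dt = k1[E][S] - (k-1 + k2)[ES]
    p  += dt * cat                      # d[P]/dt  = k2[ES]
print(round(e, 3), round(s, 3), round(es, 3), round(p, 3))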
3 Stochastic Concurrent Constraint Programming
sCCP [2] is obtained by adding a stochastic duration to the instructions interacting with the constraint store C, i.e. ask and tell. The most important feature added in sCCP is the continuous random variable T, associated with each instruction; it represents the time needed to perform the corresponding operation in the store. T is exponentially distributed, and its probability density function is f(τ) = λe^(−λτ), where λ is a positive real number (the rate of the exponential random variable) representing the expected frequency per unit of time. The duration of an ask or a tell can depend on the state of the store at the moment of the execution.
The main difference of sCCP with respect to classical cc is the presence of two different actions with temporal duration, ask and tell, identified by a rate function λ: tellλ(c) and askλ(c), following the probability law above. It means that the reaction occurs in a stochastic time T with f(τ) = λe^(−λτ), whose mean is 1/λ; i.e. tell∞ is an instantaneous execution while tell0 never occurs.
The other constructs are the same as in CCP, except for the variables, which in CCP are rigid, in the sense that, whenever they are instantiated, they keep that value forever. Time-varying variables (called stream variables) can be easily modeled in sCCP as growing lists with an unbounded tail: X = [a1, ..., an | T]. When the quantity changes, we simply need to add the new value, say b, at the end of the list by replacing the old tail variable with a list containing b and a new tail variable: T = [b | T′]. When we need to know the current value of the variable X, we extract from the list the value immediately preceding the unbounded tail. Stream variables are denoted with the assignment $=.
We model the biochemical equation [4] in sCCP with the following recursively defined method:

react(XIIa, XIa, KM, V0) :−
    ask_rMM(KM,V0,XIIa) (XIIa > 0).
    (tell∞(XIIa $= XIIa − 1) || tell∞(XIa $= XIa + 1)).
    react(XIIa, XIa, KM, V0)

where the rate λ of the ask is computed by the Michaelis-Menten kinetics: rMM(KM, V0, XIIa) = (V0 × XIIa)/(XIIa + KM). Roughly, the program inserts in the store the current values of the variables KM, V0 and XIIa; it checks the value of the factor XIIa; then, with an immediate effect, it updates the values of the factors XIIa (reagent) and XIa (product) with the new values. Subsequently it executes a new instance of the program.
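To give a feel for the behaviour this encodes, the following Python sketch (our interpretation, not the authors' code) fires the ask after an exponentially distributed delay with rate rMM and then applies the two instantaneous tells; KM, V0 and the initial amounts are illustrative values:

import random

def r_mm(km, v0, xiia):
    return (v0 * xiia) / (xiia + km)        # Michaelis-Menten rate of the ask

def react(xiia, xia, km=1.0, v0=0.5, t_end=100.0):
    t = 0.0
    while xiia > 0:
        t += random.expovariate(r_mm(km, v0, xiia))   # stochastic duration of the ask
        if t > t_end:
            break
        xiia -= 1                           # tell_inf(XIIa $= XIIa - 1)
        xia += 1                            # tell_inf(XIa $= XIa + 1)
    return xiia, xia

print(react(xiia=100, xia=0))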
4 Hybrid cc
Hybrid concurrent constraint programming (Hybrid cc [8]) is a powerful framework for modeling, analyzing and simulating hybrid systems, i.e., systems that exhibit both discrete and continuous change. It is an extension of Timed Default cc [11] over continuous time. One of the major difficulties in the original cc framework is that cc programs can detect only the presence of information, not its absence [1]. Default cc extends cc with a negative ask combinator (if a else A), which executes the program A if the constraint a cannot be inferred from the store.
e = 10,
s = 5,
es = 0.01,
p = 0,
always { k1 = 1, km1 = 0.1, k2 = 0.01,
  cont(e), cont(s),
  if (e >= 0.000000001) then {
    e'  = ((km1 + k2) * es) - (k1 * e * s),
    s'  = (km1 * es) - (k1 * e * s),
    es' = (k1 * e * s) - ((km1 + k2) * es),
    p'  = k2 * es }
  else { e' = 0, s' = 0, es' = 0, p' = 0 }
},
We can observe that in the first rows we have the initial conditions with the quantities of enzyme, substrate, enzyme-substrate complex and product, and the rates of the three reactions: k1 = 1 for the first one, k−1 = 0.1 for the second one and k2 = 0.01 for the third reaction. With the syntax cont(e), cont(s), . . . we assert that the evolution of e, s, es, p is continuous. Subsequently we check the current values of e and s and then we can start the reaction; the && and = operators have the usual meanings of “and” and “equal”. We can easily obtain the quantities of e, s, es, p in the next time instant. If the if-condition does not hold, the derivatives are set to zero and the factors keep their previous amounts.
5 Biochemical Abstract Machine
The Biochemical Abstract Machine (BIOCHAM [6]) is a software environment for modeling complex cell processes, making simulations (i.e. in silico experiments), formalizing the biological properties of the system known from real experiments, checking them and using them as a specification when refining a model. BIOCHAM is based on two aspects: the analysis and simulation of boolean, kinetic and stochastic models, and the formalization of biological properties in temporal logic. For kinetic models, BIOCHAM can search for appropriate parameter values in order to reproduce a specific behavior observed in experiments and formalized in temporal logic.
We can use the Michaelis-Menten kinetics to represent the first enzymatic reaction of blood coagulation. To explain the language used to model the reaction in BIOCHAM [3], we can translate the syntax in the following way:
(k1*[E]*[S],km1*[ES]) for E + S <=> ES.
k2*[ES] for ES => E + P.
parameter(k1,1).
parameter(km1,0.1).
parameter(k2,0.01).
present(E,100).
present(S,10).
absent(ES).
absent(P).
There are two different syntax operators used to model the different kinds of reaction: <=> and =>. The first one models the reversible reaction involved in the ES formation. The second one models the irreversible reaction which produces the P factor. The for operator shows us for which substances the reaction is performed: the first for is for the E + S ⇌ E:S reaction (with rates k1 and k−1), the second one is for the E:S ⇀ E + P reaction (with rate k2).
In BIOCHAM, this simulation generates a graph and a table relating time and nanomolar variation. In the graph (Fig. 1), we have the plot of four curves, referring to the four substances involved in the reaction: the initial decrease of the enzyme and of the substrate (violet and red curves respectively), the formation of the enzyme-substrate complex (green curve) and finally the product formation (sky blue line).
Fig. 1. Graphic
6 Related and future works
The most important related features are presented in [10, 7]. In [10] the authors suggest modeling biomolecular processes, i.e. protein networks, using the pi-calculus, while [7] shows that there are two formalisms for mathematically describing the time behavior of a spatially homogeneous chemical system: the deterministic approach and the stochastic one. The first regards the time evolution as a continuous and predictable process which is governed by a set of ordinary differential equations (the “reaction-rate equations”), while the second regards the time evolution as a kind of random-walk process which is governed by a single differential-difference equation (the “master equation”).
From the application point of view, the examined languages allow the biologist to model biological systems in a high-level and declarative way, using different kinds of applications and language constructs that directly capture a variety of biological phenomena. We are interested in the non-deterministic timed concurrent constraint calculus (ntcc [9]), because it is a concurrent constraint programming language which includes timed processes with a graphical formalism, in order to describe our biochemical reactions.
In this paper we have modeled only one of the thirteen kinetic reactions used in blood coagulation; our future work is to extend the coagulation chain using all the reactions, with the goal of comparing the formation of fibrin with previous results in the scientific literature. The second step is the in silico study of a drug which decreases the hepatic secretion of four clotting factors, and we want to analyze the derived outcomes.
References
1. Alexander Bockmayr and Arnaud Courtois. Using hybrid concurrent constraint
programming to model dynamic biological systems. In ICLP, pages 85–99, 2002.
2. Luca Bortolussi and Alberto Policriti. Modeling biological systems in stochastic
concurrent constraint programming. Constraints, 13(1-2):66–90, 2008.
3. Nathalie Chabrier-Rivier, François Fages, and Sylvain Soliman. The biochemical
abstract machine biocham. In Proc. CMSB, pages 172–191, 2004.
4. Thomas Devlin. Textbook of biochemistry with clinical correlations. McGraw Hill Book co., 2001.
5. A.J. van der Schaft and J.M. Schumacher. Introduction to Hybrid Dynamical Systems. Springer-Verlag, 1999.
6. François Fages. Temporal logic constraints in the biochemical abstract machine
biocham. In Patricia M. Hill, editor, LOPSTR, volume 3901 of Lecture Notes in Computer Science, pages 1–5. Springer, 2005.
7. Daniel T. Gillespie. Exact stochastic simulation of coupled chemical reactions. In The
Journal of Physical Chemistry, pages 2340–2352, 1977.
8. Vineet Gupta, Radha Jagadeesan, and Vijay A. Saraswat. Computing with continuous change. Sci. Comput. Program., 30(1-2):3–49, 1998.
9. Julian Gutiérrez, Jorge A. Pérez, Camilo Rueda, and Frank D. Valencia. Timed
concurrent constraint programming for analysing biological systems. Electr. Notes
Theor. Comput. Sci., 171(2):117–137, 2007.
10. Aviv Regev, William Silverman, and Ehud Y. Shapiro. Representation and simulation of biochemical processes using the pi-calculus process algebra. In Proc. Pacific
Symposium on Biocomputing, pages 459–470, 2001.
11. Vijay A. Saraswat, Radha Jagadeesan, and Vineet Gupta. Timed default concurrent
constraint programming. J. Symb. Comput., 22(5/6):475–520, 1996.
12. Williams. Williams Hematology. McGraw Hill Book co., 2006.
Capturing fair computations on Concurrent
Constraint Language
Paola Campli (campli@sci.unich.it),
Supervisor: Stefano Bistarelli (bista@sci.unich.it)
University G.D’Annunzio - Pescara, Italy
Abstract. The study of concurrency and nondeterminism in programming languages is often related to fairness. It is a feature that should be included in contexts where there are repetitive choices among alternatives and it is desirable that no alternative is ignored forever or postponed infinitely often.
The aim of my research is to guarantee equitable computations in Concurrent Constraint languages. This paper presents an extension of the language regarding the parallelism operator, using as selection criterion a metric provided by a fair carpool scheduling algorithm. The new parallelism operator (||m) is able to deal with a finite number m of agents and guarantees a fair progression of the computation, ensuring that no (enabled) agent is never executed.
Introduction
This paper contains an extension of Concurrent Constraint Programming (CCP) in order to guarantee a fair criterion of selection between parallel agents and to restrict or remove possible unwanted behaviors of a program.
This document is structured as follows: in the first section the common definitions of fairness are briefly presented; in section two the Concurrent Constraint language is described; in section three a fair CC version is presented, through an extension of the parallelism operator based on the idea proposed by a fair carpooling algorithm.
1 Fairness in programming languages
Some of the most common notions of fairness in computer science are given by Nissim Francez in [1]: weak fairness requires that if an action (or agent) is continuously enabled (so it can almost always proceed) then it must eventually do so, while strong fairness requires that if an action (or agent) can proceed infinitely often then it must eventually proceed (be executed). In order to guarantee the above fairness properties for the CC language, we add a condition (or guard) that enables the agents with a lower cost to succeed.
2 Concurrent Constraint Programming
The concurrent constraint (cc) programming paradigm [4] concerns the behavior of a set of concurrent agents with a shared store, which is a conjunction of constraints (relations among a specified set of variables). Each computation step possibly adds new constraints to the store. Thus information is monotonically added until all agents have evolved. The final store is a refinement of the initial one and is the result of the computation. The concurrent agents communicate with the shared store, either by checking if it entails a given constraint (ask operation) or by adding a new constraint to it (tell operation).
The syntax of a cc program is shown in Table 1: P is the class of programs, F is the class of sequences of procedure declarations (or clauses), A is the class of agents, c ranges over constraints, and x is a tuple of variables. The + combinator expresses nondeterminism. We also assume that, in p(x) :: A, vars(A) ⊆ x, where vars(A) is the set of all variables occurring free in agent A. In a program P = F.A, A is the initial agent, to be executed in the context of the set of declarations F.
The intuitive behavior of the agents is: agent “success” succeeds in one step;
Table 1. cc syntax
P ::= F.A
F ::= p(x) :: A | F.F
A ::= success | fail | tell(c) → A | E | A||A | ∃x A | p(x)
E ::= ask(c) → A | E + E
agent “fail” fails in one step; agent “ask(c) → A” checks whether constraint c is entailed by the current store and then, if so, behaves like agent A; if c is inconsistent with the current store, it fails, and otherwise it suspends until c is either entailed by the current store or is inconsistent with it. Agent “ask(c1) → A1 + ask(c2) → A2” may behave either like A1 or like A2 if both c1 and c2 are entailed by the current store; it behaves like Ai if only ci is entailed; it suspends if both c1 and c2 are consistent with but not entailed by the current store; and it behaves like “ask(c1) → A1” whenever “ask(c2) → A2” fails (and vice versa). Agent “tell(c) → A” adds constraint c to the current store and then, if the resulting store is consistent, behaves like A, otherwise it fails. Agent A1||A2 behaves like A1 and A2 executing in parallel; agent ∃x A behaves like agent A, except that the variables in x are local to A; p(x) is a call of procedure p.
Here is a brief description of (some of) the transition rules:
tell:             ⟨tell(c) → A, σ⟩ → ⟨A, σ ⊗ c⟩
ask:              if σ ⊢ c, then ⟨ask(c) → A, σ⟩ → ⟨A, σ⟩
parallelism (1):  if ⟨A1, σ⟩ → ⟨A1′, σ′⟩, then ⟨A1||A2, σ⟩ → ⟨A1′||A2, σ′⟩
parallelism (2):  if ⟨A1, σ⟩ → ⟨success, σ′⟩, then ⟨A1||A2, σ⟩ → ⟨A2, σ′⟩
nondeterminism:   if ⟨E1, σ⟩ → ⟨A1, σ′⟩, then ⟨E1 + E2, σ⟩ → ⟨A1, σ′⟩
hidden variables: if ⟨A[y/x], σ⟩ → ⟨A1, σ′⟩, then ⟨∃x A, σ⟩ → ⟨A1, σ′⟩
procedure call:   ⟨p(y), σ⟩ → ⟨A[y/x], σ⟩ where p(x) :: A
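As a small illustration (ours, not taken from the paper), consider A1 = tell(x = 1) → success and A2 = ask(x = 1) → tell(y = 2) → success running in parallel from the empty store true. Since ⟨A1, true⟩ → ⟨success, x = 1⟩, rule parallelism (2) gives ⟨A1||A2, true⟩ → ⟨A2, x = 1⟩; now x = 1 ⊢ x = 1, so the ask rule yields ⟨tell(y = 2) → success, x = 1⟩, and a final tell step produces the store x = 1 ⊗ y = 2.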
Notice that || and + are commutative and associative operators.
3 Fair Concurrent Constraint Programming
To obtain a fair version of the parallel execution, we modify the parallel operator || to use a quantitative metric that provides a more accurate way to establish which of the agents can succeed. The metric we use is the one proposed by a fair carpool scheduling algorithm [2], which is described in more detail below.
3.1 The fair carpool scheduling algorithm
Carpooling consists in sharing a car among a driver and one or more passengers to divide the costs of the trip. We want a scheduling algorithm that will be perceived as fair by all the members, so as to encourage their continued participation.
Let U be a value that represents the total cost of the trip. It is convenient (but not necessary) to take U to be the least common multiple of 1, 2, . . . , m, where m is the largest number of people who ever ride together at a time in the carpool. Since on a given day the participants can be fewer than m, we define n as the number of participants in the carpool on that day (n ∈ [1 . . . m]).
Each day we calculate the passengers' and driver's scores. On the first day each member has score zero; on the following days the driver increases his score by αn = U(n − 1)/n, while the remaining n − 1 passengers decrease their score by βn = U/n.
As proved in [2], this algorithm is fair.
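As a small aside (ours, not from [2]), the bookkeeping is easy to sketch in Python; we also assume here, as the guard k[li] ≤ k[lj] of Section 3.2 suggests, that the member with the lowest score drives:

def carpool_day(scores, riders, U=60):
    """scores: member -> score; riders: the n members travelling today (n <= m)."""
    n = len(riders)
    driver = min(riders, key=lambda member: scores[member])
    for member in riders:
        if member == driver:
            scores[member] += U * (n - 1) / n     # alpha_n
        else:
            scores[member] -= U / n               # beta_n
    return driver

scores = {"a": 0, "b": 0, "c": 0}
for day in range(3):
    print(carpool_day(scores, ["a", "b", "c"]), scores)   # scores always sum to zero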
3.2 Applying the carpooling algorithm over CC
A problem we encounter with the CC semantics is that only two parallel agents are represented in the Parallelism rule; therefore it is not possible to express fairness with more than 2 agents, since we need to associate a value (cost or preference) to each agent.
In this section the syntax and the semantics of the language are extended with the new operator ||m in order to model the parallel execution of m agents (with m finite). Moreover we include an array k (with m elements) that allows us to keep track of the actions performed by each agent during the computational steps (we use the notation ⟨Ai, k[i], σ⟩ instead of ⟨Ai, σ⟩). In this way we can associate a value k[i] to each agent Ai. The syntax is modified as in Table 2.
Table 2. Fair version of cc syntax
P ::= F.A
F ::= p(x) :: A | F.F
A ::= success | fail | tell(c) → A | E | ||m (A1 . . . Am) | ∃x A | p(x)
E ::= ask(c) → A | E + E
With reference to the carpooling algorithm, we consider the participants in
the carpool on a given day as the n enabled agents (Al1 →, . . . , Aln →), and we
represent the driver by the agent Ali , while the passengers are the remaining
agents Alj (j ∈ [1, . . . , n], j ≠ i).
Agent Ali increases its score by αn , while the n − 1 passengers decrease their
score by βn . Initially all elements of the array k are equal to 0.
In the new transition rule we add the set of values associated with each agent
by the array k; we also insert into the precondition of the rule a guard (k[i] ≤ k[j])
that compares the agents’ values and establishes which of the agents can evolve or
succeed.
In the initial phase, the enabled agent Ali evolves into A′li with value αn . The
other enabled agents instead assume the value −βn .
In the following steps we update the values of the array k to obtain k′ , increasing or decreasing the previous values of k by αn or βn . We obtain the transition rule
in Table 3.
Since ‖m is an associative and commutative operator, we can omit rewriting the second, symmetric, rule of parallelism (1).
We use the same criterion for the parallelism (2) rule (Table 4).
If agent Ali succeeds, the rule deletes it together with the corresponding element of k. Consequently, the value m of the operator ‖m is continuously
Table 3. Carpooling fairness - parallelism (1) rule

    k[li ] ≤ k[lj ]  ∀ j = 1, . . . , n      Al1 →, . . . , Aln →      ⟨Ali , σ⟩ → ⟨A′li , σ′ ⟩
    ------------------------------------------------------------------------------
    ⟨‖m (A1 , . . . , Ali , . . . , Am ), k, σ⟩ → ⟨‖m (A1 , . . . , A′li , . . . , Am ), k′ , σ′ ⟩

    where k′ [x] =  k[x]         if x = 1, . . . , m, x ≠ li , lj
                    k[x] + αn    if x = li
                    k[x] − βn    if x = lj , j ≠ i
Table 4. Carpooling fairness - parallelism (2) rule

    k[li ] ≤ k[lj ]  ∀ j = 1, . . . , n      Al1 →, . . . , Aln →      ⟨Ali , σ⟩ → ⟨success, σ′ ⟩
    ------------------------------------------------------------------------------
    ⟨‖m (A1 , . . . , Ali , . . . , Am ), k, σ⟩ → ⟨‖m−1 (A1 , . . . , Ali−1 , Ali+1 , . . . , Am ), k′ , σ′ ⟩

    where k′ [x] =  k[x]    if x ≠ li
                    null    if x = li
decreased by 1 every time an agent succeeds. We therefore end with a final computation involving a single agent and a single element of the array, which
reinstates the original rule: ⟨‖1 (A1 ), k[1], σ⟩ → ⟨A1 , σ⟩.
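A minimal, purely illustrative Python sketch of the bookkeeping behind the ‖m rules follows; the agent interface (enabled/step) is an assumption of ours and not part of any CC implementation.

    def fair_parallel_step(agents, k, store, U):
        """One application of the fair parallelism rules: among the enabled
        agents, one with minimal score (the guard k[li] <= k[lj]) performs a
        step; its score grows by alpha_n, the other enabled scores shrink by
        beta_n; a succeeding agent is dropped together with its k entry."""
        enabled = [i for i, a in enumerate(agents) if a.enabled(store)]
        if not enabled:
            return agents, k
        n = len(enabled)
        alpha_n, beta_n = U * (n - 1) / n, U / n
        li = min(enabled, key=lambda i: k[i])   # guard: a minimal-score agent moves
        outcome = agents[li].step(store)        # "progress", "suspend" or "success"
        for j in enabled:
            k[j] += alpha_n if j == li else -beta_n
        if outcome == "success":                # parallelism (2): m decreases by 1
            agents = agents[:li] + agents[li + 1:]
            k = k[:li] + k[li + 1:]
        return agents, k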
Acknowledgement
We wish to thank Paolo Baldan for his help in discussing and improving the
contents of the paper.
4 Future Work
The existing definitions of fairness in the literature (strong fairness, weak fairness, unconditional fairness, equifairness) have led to the development of various fairness-like constructs, which vary according to the computational model considered. The
limitation we observe in such notions is that they can only be used to model
infinite computations. In fact, when dealing with finite computations, an agent cannot
be continuously or infinitely often enabled; for this reason we need a quantitative
and more accurate notion of fairness.
As proved in [3], it is not possible to write an expression that captures all
the fair behaviours of a system in a programming language with a primitive for
finitary choice, because expressing fairness requires unbounded nondeterminism.
The aim of our research is to provide new notions of fairness suitable for
finite computations and to guarantee equitable computations in the Concurrent
Constraint language and its extensions.
Moreover, we plan to provide quantitative valuations by using soft constraints;
we will use a semiring structure to measure how fair the service is. To do
this we plan to use the same indexes that are used to measure inequality in
economics, such as the Gini coefficient, which indicates the inequality of income or
wealth.
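As a pointer, the Gini coefficient mentioned above can be computed from a list of per-agent values with the standard mean-absolute-difference formula; the following small Python sketch (with made-up data) only illustrates the index, not our planned semiring-based valuation.

    def gini(values):
        """Gini coefficient of a list of non-negative values:
        0 = perfect equality, values close to 1 = strong inequality."""
        n = len(values)
        mean = sum(values) / n
        if mean == 0:
            return 0.0
        diff_sum = sum(abs(x - y) for x in values for y in values)
        return diff_sum / (2 * n * n * mean)

    print(gini([1, 1, 1, 1]))   # 0.0  -> perfectly fair
    print(gini([0, 0, 0, 4]))   # 0.75 -> very unfair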
References
1. Nissim Francez, Fairness (Texts and Monographs in Computer Science), isbn:
0387962352, Springer-Verlag (1986)
2. Ronald Fagin and John H. Williams, A Fair Carpool Scheduling Algorithm, International Business Machines Corporation, 1983, pages 133–139
3. Abha Moitra, Prakash Panangaden, Finitary choice cannot express fairness: A metric space technique, 1986
4. Vijay A. Saraswat and Martin Rinard, Concurrent constraint programming, POPL
1990
Finding Stable Solutions in Constraint Satisfaction
Problems
Laura Climent, Miguel A. Salido and Federico Barber
Instituto de Automática e Informática Industrial
Universidad Politécnica de Valencia.
Valencia, Spain
Abstract. Constraint programming is a successful technology for solving combinatorial problems modeled as constraint satisfaction problems (CSPs). An important extension of constraint technology involves problems that undergo changes
that may invalidate the current solution. These problems are called Dynamic
Constraint Satisfaction Problems (DynCSP). Many works on dynamic problems
have sought methods for finding new solutions. In this paper, we focus our attention on
finding stable solutions, which are able to absorb changes. To this end, each constraint maintains a label that measures its degree of variability. Thus, we restrict
the original search space by using these labels, and solutions inside the restricted
search space remain valid under changes.
1 Introduction
Nowadays many real problems can be modeled as constraint satisfaction problems and
are solved using constraint programming techniques. Much effort has been spent to increase the efficiency of the constraint satisfaction algorithms: filtering, learning and distributed techniques, improved backtracking, use of efficient representations and heuristics, etc. This effort resulted in the design of constraint reasoning tools which were used
to solve numerous real problems. However all these techniques assume that the set of
variables and constraints which compose the CSP is completely known and fixed. This
is a strong limitation when dealing with real situations where the CSP under consideration may evolve.
A Dynamic Constraint Satisfaction Problem [2] is an extension of a static CSP that
models the addition and retraction of constraints, and hence it is more appropriate for handling dynamic real-world problems. It is indeed easy to see that all the possible changes
to a CSP (constraint or domain modifications, variable additions or removals) can be
expressed in terms of constraint additions or removals [4].
Several techniques have been proposed to solve Dynamic CSPs, including: searching for stable solutions that are valid after small problem changes [5]; searching for a
new solution that minimizes the number of changes from the original solution [3]; or
reusing the original solution to produce a new solution [4].
– Searching for stable solutions that are valid after small problem changes explores
methods for finding solutions that are more likely to remain valid after changes,
i.e., stable solutions [5]. In order to develop this method, it is necessary to find
all the solutions if we want to obtain the most stable one according to an objective function.
– Searching for a new solution that minimizes the number of changes from the original solution considers stability as closeness of solutions in case of changes [3]. Nevertheless, with this method we do not obtain stable solutions.
– Reusing the original solution to produce a new solution consists of producing the
new solution by local changes on the previous one [4]. In order to develop this
technique, we must explore the search space.
In this paper we focus our attention on the first set of techniques: searching for stable
solutions that are valid after problem changes. Thus, we assume that the user knows
the degree of stability of each constraint, and constraints can be labeled to identify this stability. Thus, the constrainedness of the CSP can be altered by modifying/restricting the dynamic constraints.
1.1 Definitions
Following standard notation and definitions from the literature [1], we summarize the basic definitions that we will use in the rest of the paper.
Definition 1. A Constraint Satisfaction Problem (CSP) is a triple P = ⟨X, D, C⟩
where X is a finite set of variables {x1 , x2 , ..., xn }; D is a set of domains D =
D1 × D2 × ... × Dn such that for each variable xi ∈ X there is a finite set of values that
the variable can take; and C is a finite set of constraints C = {C1 , C2 , ..., Cm } which restrict
the values that the variables can simultaneously take.
Definition 2. An instantiation is a pair (xi = a) that represents the assignment of the
value a ∈ Di to the variable xi .
Definition 3. Checking a constraint: once all the variables involved in a constraint
are instantiated, the constraint can be evaluated/checked to analyze whether this assignment of values to variables satisfies the constraint or not.
Definition 4. The Solution Space (S) is the portion of the search space that satisfies
all constraints. The instantiations of all variables that satisfy all constraints belong
to the solution space. If all constraints are convex, the solution space is a convex hyper-polyhedron, which is included in the Cartesian product of the variable domains.
2 Finding Stable Solutions
Our aim is to find solutions to problems that remain valid under changes. Many real
problems, such as planning and scheduling problems, can be modeled as a CSP. However, many of the constraints are set knowing that they are likely to be modified. For
instance, in scheduling problems, if a worker starts working at time T 1begin and his
first task must be finished within 50 minutes, then the constraint is (T 1end − T 1begin < 50).
However, the worker could arrive late to work due to many factors. Thus, we can restrict this constraint to (T 1end − T 1begin < 40): if something goes wrong, the task can still be
delayed by some minutes. Thus, we can label this constraint as a dynamic constraint of
level 2 (for instance). If the problem is feasible with the more restricted constraints, the
obtained solutions will be more stable than the solutions of the original problem, because
if the worker arrives late (by less than 10 minutes) the solution remains valid. If the
problem is not feasible, the dynamic constraints can be relaxed until they become the
original ones. It must be taken into account that the search space previously explored
must not be explored again: we can add the inverse constraints to
avoid searching the previously explored region. For instance, in the previous example, if
the constraint (T 1end − T 1begin < 40) leaves the problem infeasible, then we add the
constraint (T 1end − T 1begin ≥ 40) and relax the restricted one to (T 1end − T 1begin < 45).
This modeling technique will be appropriate mainly in under-constrained problems.
Without loss of generality we consider four levels of stability for the constraints:
– Level 0 means that the constraint is very stable so there is no probability that it
changes (no dynamic constraint).
– Level 1 means that the constraint is stable so the probability that it changes is low.
– Level 2 means that the constraint is not very stable, so the probability that it changes is
high.
– Level 3 means that the constraint is very unstable so the probability that it changes
is very high (very dynamic constraint).
The probability that the domains of a CSP change can also be represented
by the levels of stability explained above. At the lowest level the domains cannot
change, whereas at the highest level a reduction in the domain size is very likely.
2.1 By Restricting Constraints
In order to find stable solutions by restricting constraints, each constraint is labeled with
one of the previous integer levels. We assume constraints of the form Ax − k ≥ 0, that is,
a1 x1 + a2 x2 + ... + an xn − k ≥ 0. Thus, from each constraint Ci and its label li we
generate a new constraint Ci′ : Ax − (li d + k) ≥ 0, where d > 0 is a parameter given by the
user to restrict the constraint (for d = 0 the CSP’ would be the same as the original CSP).
The parameter d allows the user to restrict the constraints to a greater or lesser degree;
this is very useful because if the CSP’ has no solution, the user can relax the constraints.
Thus, if li = 0 the constraint remains the original one: Ci′ : Ax − (0d + k) ≥ 0 → Ci′ : Ax − k ≥ 0.
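The transformation can be summarised by the following small Python sketch (ours); constraints are assumed to be given as coefficient vectors together with the constant k, following the form Ax − k ≥ 0 above.

    def restrict(constraints, labels, d):
        """constraints: list of (a, k) meaning a1*x1 + ... + an*xn - k >= 0;
        labels: stability label l_i of each constraint; d > 0: user parameter.
        Returns the tightened constraints C_i': a.x - (l_i * d + k) >= 0."""
        assert d > 0
        return [(a, l * d + k) for (a, k), l in zip(constraints, labels)]

    def satisfies(x, constraints):
        """Check an instantiation x against a set of linear constraints."""
        return all(sum(ai * xi for ai, xi in zip(a, x)) - k >= 0
                   for a, k in constraints)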
Input: A CSP P = ⟨X, D, C⟩, with C = {C1 , C2 , ..., Cm }, and a set of labels L = {l1 , l2 , ..., lm }.
Output: level of stability of solutions Sol and Sol′ .

    for every constraint Ci and label li
        generate a new constraint Ci′
    endfor
    C ′ = {C1′ , C2′ , ..., Cm′ }
    compose a CSP P ′ = ⟨X, D, C ′ ⟩
    Sol ← Solve(P )
    if ∃ Sol
        do
            Sol′ ← Solve(P ′ )
            if ¬∃ Sol′ then relax P ′
        while ¬∃ Sol′
        if Sol ≠ Sol′ then Check-Stability(Sol, Sol′ )
        return Stability(Sol), Stability(Sol′ )
    else
        no solution exists
    end
Fig. 1. Example of CSP (left). Resultant CSP once constraints are modified by their labels (right).
2.2 By Reducing Domain Size
This method consists in reducing the domain size of the variables in the upper and lower
bound. This technique is similar to the previous one by adding new unary constraints to
the set of constraints. In this case, the user can also select the stability of each variable.
These values are represented by labels that are used to insert unary constraints into the
system. Figure 2 left shows the previous example with the original domains and Figure
2 right shows the same problem with the reduced domain sizes. Thus, the solution space
is reduced, and solutions inside this solution space are more stable.
3 Example
In the following, we present an example to clarify the stability methods presented in the
previous section. For simplicity, we consider a CSP with only two variables x1 and x2
with domains D1 : {3..7} and D2 : {2..6} (see Figure 1). The constraints are:
C1 : x1 + x2 ≥ 6
C2 : x1 + x2 ≤ 12
C3 : x1 − x2 ≤ 4
C4 : −x1 + x2 ≤ 2
Figure 1 (left) shows a graphic representation of the original problem. It can be
observed that the Cartesian product of the domains generates 25 different instantiations. The original constraints remove the non-valid instantiations (x1 = 3, x2 = 2),
(x1 = 7, x2 = 2), (x1 = 3, x2 = 6), (x1 = 7, x2 = 6). Thus, the remaining 21 instantiations are solutions of the CSP. If a CSP solver gives us a subset of solutions, which one
is more robust? To answer this question, we make use of the constraint labels.
In Figure 1 (left) we present the four constraints with their corresponding labels:
C1 (0), C2 (2), C3 (1), C4 (1). If we consider d = 1, the resultant constraints are:
C1 (0) : x1 + x2 ≥ 6 → C1′ : x1 + x2 ≥ 6,
C2 (2) : x1 + x2 ≤ 12 → C2′ : x1 + x2 ≤ 10,
C3 (1) : −x1 + x2 ≤ 2 → C3′ : −x1 + x2 ≤ 1,
C4 (1) : x1 − x2 ≤ 4 → C4′ : x1 − x2 ≤ 3.
The resultant CSP P ′ is shown in Figure 1 (right). It can be observed that constraint
C1 has label l1 = 0, so it is a static constraint. The remaining constraints have labels 1
and 2, which means that they are dynamic constraints and a tighter CSP is possible. The
new constraints remove some instantiations, for instance (x1 = 7, x2 = 5),
(x1 = 7, x2 = 4), (x1 = 6, x2 = 6), (x1 = 6, x2 = 5). These instantiations, which are
solutions to the original CSP, are candidates to become invalid if constraint C2 changes.
As can be seen in Figure 1 (right), the solution space has been reduced, and the set of
solutions S ′ ⊂ S can be considered more robust than the set of solutions S \ S ′ .
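The example can be checked by brute force; the following Python snippet (ours) enumerates the Cartesian product of the domains and counts the solutions of the original and of the tightened CSP.

    from itertools import product

    D1, D2 = range(3, 8), range(2, 7)

    original = [lambda x1, x2: x1 + x2 >= 6,
                lambda x1, x2: x1 + x2 <= 12,
                lambda x1, x2: x1 - x2 <= 4,
                lambda x1, x2: -x1 + x2 <= 2]

    tightened = [lambda x1, x2: x1 + x2 >= 6,     # label 0: unchanged
                 lambda x1, x2: x1 + x2 <= 10,    # label 2: 12 - 2*1
                 lambda x1, x2: -x1 + x2 <= 1,    # label 1: 2 - 1
                 lambda x1, x2: x1 - x2 <= 3]     # label 1: 4 - 1

    def solutions(cs):
        return [(x1, x2) for x1, x2 in product(D1, D2) if all(c(x1, x2) for c in cs)]

    print(len(solutions(original)), len(solutions(tightened)))   # 21 12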
Fig. 2. Resultant CSP once domain sizes are reduced
4 Conclusions and Further Work
Many real problems can be modeled as Dynamic Constraint Satisfaction Problems. Several techniques have been developed to manage Dynamic CSPs. Some of them are based
on searching for stable solutions that are valid after small problem changes. However,
this is a very hard task because all solutions must be found to obtain the most
stable one based on an objective function. Other techniques are based on searching for
a new solution that minimizes the number of changes from the original solution. These
techniques are appropriate for dynamic CSPs but the obtained solutions are not stable.
Finally, some techniques are based on reusing the original solution to search for a new solution. In this case, a CSP solver must explore the search space to find a solution
to the new problem.
In this paper, we focus our attention on finding stable solutions by restricting the
original CSP with the aim of obtaining solutions able to absorb changes. To this end,
we have developed a tool for simulating a CSP’ with tighter constraints and/or domains,
depending on the dynamics of each constraint. Thus, solutions of this CSP’ remain stable
solutions for the original CSP.
There is a long line of work centered in this area. Using the fact that the reduced
CSP (CSP ′ ) has tighter constraints, filtering techniques such as arc-consistency can
reduce the domain size of the variables. So we can apply some strategies to find even
more stable solutions:
– by carrying out a value ordering. Once filtering techniques such as node-consistency
and arc-consistency have been carried out and the domains have been reduced by the previous
techniques, a value ordering heuristic can be applied to start the search at the
center of the search space. Thus, given a domain [di , Di ] of a variable xi , the search
starts by assigning the value xi = (di + Di )/2, that is, the center of the interval. For
instance, in the example of Figure 2, the search will start by assigning the values
(x1 = 5, x2 = 4).
– by searching for several solutions and carrying out convex combinations between
two solutions (see the sketch after this list). This technique is only valid for CSPs with convex constraints and
continuous domains. Once the constraints have reduced the solution space, the more
internal solutions are candidates to be more stable. Thus, a convex combination of two opposite extremal points is probably more stable than its own extremal
points.
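The convex-combination idea can be sketched as follows (an illustration of ours; it assumes continuous domains and convex constraints, as stated in the item above).

    def convex_combination(sol_a, sol_b, lam=0.5):
        """Any point on the segment between two solutions of a convex CSP is
        itself a solution; the midpoint (lam = 0.5) is a natural inner candidate."""
        return tuple(lam * a + (1 - lam) * b for a, b in zip(sol_a, sol_b))

    print(convex_combination((3.0, 4.0), (7.0, 4.0)))   # (5.0, 4.0)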
References
1. Christian Bessiere. Constraint propagation. Technical report, CNRS/University of Montpellier, 2006.
2. R. Dechter and A. Dechter. Dynamic constraint networks. In Proc. of the 7th National Conference on Artificial Intelligence (AAAI-88), pages 37–42, 1988.
3. E. Hebrard, B. Hnich, and T. Walsh. Super solutions in constraint programming. In Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization
Problems. CPAIOR-04, pages 157–172, 2004.
4. G. Verfaillie and T. Schiex. Solution reuse in dynamic constraint satisfaction problems. In
Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), pages
307–312, 1994.
5. R. Wallace and E.C. Freuder. Stable solutions for dynamic constraint satisfaction problems.
In Proc. 4th International Conference on Principles and Practice of Constraint Programming,
pages 447–461. Springer, 1998.
Preliminary studies of BnB-ADOPT related with Soft
Arc Consistency ⋆
Patricia Gutierrez (student) and Pedro Meseguer (supervisor)
IIIA, Institut d’Investigació en Intel.ligència Artificial
CSIC, Consejo Superior de Investigaciones Cientı́ficas
Campus UAB, 08193 Bellaterra, Spain.
{patricia|pedro}@iiia.csic.es
Abstract. BnB-ADOPT is a distributed asynchronous search algorithm for Distributed Constraint Optimization Problem (DCOP) solving. When searching, an
agent may realize that some of its values will never be in the solution. In that case,
these values are unconditionally deleted and their deletions are propagated. Using soft arc consistency techniques, more values in other agents could be pruned
as consequence of this propagation. This causes reductions in the search effort
required to solve DCOP instances. In this paper we propose the idea of an algorithm based on BnB-ADOPT that identifies and propagates unconditional value
deletions as search progresses, expecting an improvement in the overall search
effort.
1 Introduction
A classical problem in computer science is finding the global optimum of the aggregation of some elementary functions. When these elementary functions are distributed
among several agents, and cannot be centralized into one agent (due to privacy issues,
the natural distributed origin of the problem data and the high costs of its translation,
among others), the problem is called Distributed Constraint Optimization (DCOP).
This problem has received a lot of attention in recent years. DCOP solving algorithms,
such as ADOPT [5] or DPOP [6] have been developed. Considering DCOP solving
by distributed asynchronous search, BnB-ADOPT [8] offers better performance than
ADOPT while keeping its good theoretical properties (correctness, completeness, termination). It requires a relatively simple implementation.
During the search process, some agents executing BnB-ADOPT may discover that
some values will never be in the solution. In these cases, we propose to remove those
values from the DCOP instance, which when propagated, may cause further value removals. As the search space becomes smaller, we expect to get some benefits (in terms
of the number of messages exchanged or the number of cycles required) from this approach. We propose to propagate these value removals using a new message type. Maintaining the simplest form of soft arc consistency will allow us to prune further values.
⋆
Supported by the Spanish project TIN2006-15387-C03-01.
The paper is organized as follows. In section 2, we provide the basic definitions and
a description of BnB-ADOPT. In section 3, we present the idea of propagating unconditional deletions within the BnB-ADOPT algorithm by applying soft arc consistency
techniques.
2 Preliminaries
In a centralized setting, a Constraint Optimization Problem (COP) involves a finite
set of variables, each taking a value in a finite domain [1]. Variables are related by
cost functions that specify the cost of value tuples on some variable subsets. Costs
are positive natural numbers (including 0 and ∞). A finite COP is (X, D, C) where,
X = {x1 , . . . , xn } is a set of n variables; D = {D(x1 ), . . . , D(xn )} is a collection of
finite domains; D(xi ) is the initial set of xi possible values; C is a set of cost functions;
fi ∈ C, defined on the ordered set of variables var(fi ) = (xi1 , . . . , xir(i) ), specifies the cost of
every combination of values of var(fi ), that is, fi : D(xi1 ) × . . . × D(xir(i) ) → N+ . The arity
of fi is |var(fi )|. The overall cost of a complete tuple (involving all variables) is the
addition of all individual cost functions on that particular tuple. A solution is a complete
tuple with acceptable overall cost, and it is optimal if its overall cost is minimal.
Moving into a distributed context, a Distributed Constraint Optimization Problem
(DCOP), is a COP where variables, domains and cost functions are distributed among
automated agents [5]. A variable-based DCOP is a 5-tuple (X, D, C, A, α), where
X, D, C define a COP, A is a set of p agents and α maps each variable to one agent.
For simplicity, we assume that each agent holds exactly one variable (so the terms variable and agent can be used interchangeably) and cost functions are unary and binary
only (in the following a cost function is denoted as C with the indexes of variables involved). DCOPs can be solved by ADOPT, as the reference algorithm for asynchronous
distributed search [5]. Aiming at improving its performance, several ADOPT-based algorithms have been proposed, like BnB-ADOPT [8] and ADOPT-ng [7]. This paper
deals with BnB-ADOPT, so we provide a brief summary of this algorithm.
BnB-ADOPT performs a distributed depth-first branch and bound search. It appears
in Figure 1. It uses message passing as the communication form between agents, using
three message types: VALUE, COST and STOP. Agents are arranged in a pseudotree as
defined in [5]. A generic agent self executes BnB-ADOPT with these data structures:
the current context (values that self believes that are assigned to higher agents in the
branch from self to the root), and for each value of its domain and each child agent:
a lower bound lb, an upper bound ub and a context. With this self calculates its own
lower bound LB and upper bound U B based on its local cost plus any cost reported by
its childrenPas follows:
P
LB[d] ← (xi ,di )∈myContext Cself,i (myV alue, di )+ xk ∈myChildren lb[myV alue, xk ];
P
P
U B[d] ← (xi ,di )∈myContext Cself,i (myV alue, di )+ xk ∈myChildren ub[myV alue, xk ];
U B ← mind∈Dself U B[d].
LB ← mind∈Dself LB[d];
33
procedure BnB-ADOPT()
myContext ← tableContexts ← empty table;
for each xi ∈ myChildren ∧ v ∈ D(self ) do lb[v, xi ] ← 0; ub[v, xi ] ← ∞;
end ← false;
InitSelf(); Backtrack();
while (¬end) do
while message queue is not empty do
msg ← getMsg();
switch(msg.type)
VALUE: ProcessValue(msg);  COST: ProcessCost(msg);
STOP: end ← true;
Backtrack();
procedure ProcessValue(msg)
if (myContext[msg.sender] ≠ msg.value) then
myContext[msg.sender] ← msg.value; CheckCurrentContextWithChildren(); InitSelf();
if msg.sender = myParent then T H ← msg.threshold;
procedure InitSelf()
myValue ← argminv∈Dself LB(v);
T H ← ∞;
procedure ProcessCost(msg)
contextChange ← false;
for each xi ∈ msg.context, xi ∉ myNeighbors do
if myContext[xi ] ≠ msg.context[xi ] then
myContext[xi ] ← msg.context[xi ]; contextChange ← true;
if contextChange = true then
CheckCurrentContextWithChildren(); InitSelf();
if isCompatible(myContext, msg.context) then
lb[msg.context[self ], msg.sender] ← max{msg.lb, lb[msg.context[self ], msg.sender]};
ub[msg.context[self ], msg.sender] ← min{msg.ub, ub[msg.context[self ], msg.sender]};
tableContext[msg.context[self ], msg.sender] ← msg.context;
procedure Backtrack()
if T H ≤ LB() then T H ← ∞;
if LB(myValue) ≥ min(T H, UB()) then myValue ← argminv∈Dself LB(v);
SendValueMessageToLowerNeighbors(myValue);
if (self = root∧ LB() = UB()) ∨ end then
end ← true;
for each child ∈ myChildren do sendMsg(ST OP, self, child);
else sendMsg(COST, self, myP arent, myContext,LB(), UB());
procedure CheckCurrentContextWithChildren()
for each val ∈ D(self ) ∧ child ∈ myChildren do
if ¬isCompatible(myContext, tableContexts[val, child]) then
tableContexts[val, child] ← empty; lb[val, child] ← 0; ub[val, child] ← ∞;
procedure SendValueMessageToLowerNeighbors(myValue)
cost ← Σj∈myParent∪myPseudoparents Cself,j (myValue, myContext[j]);
for each child ∈ myChildren do
th ← min{T H, UB()} − cost − Σj∈myChildren,j≠child lb[myValue, j];
sendMsg(V ALU E, self, child, myValue, th);
Fig. 1. The original BnB-ADOPT algorithm.
self also maintains a threshold T H used for pruning during the depth-first search, initialized to ∞ (in the root agent it always remains infinite). Variable myValue stands
for the agent’s current assignment, and myChildren stands for its children in
the pseudotree as defined in [5]. At the beginning, self chooses the value that minimizes
its LB and sends a VALUE message to its children and pseudochildren. In response, a child sends
a COST message to its parent reporting the child’s LB and U B under the current context. When
self receives a VALUE, this indicates that its parent or a pseudoparent has changed its
value, and self’s context is updated. self checks its context against each child context: if
they are not compatible, the information provided by that child (stored in the lb and ub
tables) is reinitialized and the current value may change. The threshold value sent by
the parent is also updated.
When self receives a COST, a child is informing self of its LB and U B. As these
bounds depend on the values of higher variables, the child must attach the
context under which they were calculated. self checks the received context and
adds to its own context any higher variable assignment it is not constrained with. If the received context is compatible with its own context, self updates the information
in the lb and ub tables; otherwise it ignores it.
Once every message in the self input queue has been processed, self performs Backtrack,
where it decides whether it must change its value. If the LB of the current value is greater
than or equal to min(T H, U B()), self changes its value to the one with the smallest LB. Notice
that in the first iterations the U B of an agent will be infinite until every child has reported
its corresponding COST (so an agent will not backtrack until all information is received), but in later iterations it is not necessary to wait for every child to report
its cost, since an agent can decide to discard the current value with partial information
if the LB of this value is greater than the stored T H, which works as a global upper
bound for that child subtree. In any case, the agent will send a VALUE to each child
ch with its current value and the desired threshold, calculated as

    min(T H, U B()) − Σ(xi ,di )∈myContext Cself,i (myValue, di ) − Σxk ∈myChildren, xk ≠ch lb[myValue, xk ].
This desired threshold will be used for pruning once the child has reached this value;
it is chosen so that LB(d) for the agent reaches min(T H, U B()) and, therefore,
the agent takes a new value when the costs received from the children reach the desired
threshold. In the next step the termination condition is checked; it is triggered
by the root when LB = U B, which can only happen when all values in
the root domain have been explored or pruned. Finally, a COST is sent to the parent.
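For illustration only, the threshold sent to a child ch can be written as the following Python function; the dictionaries are stand-ins of ours for the agent's local data structures, not BnB-ADOPT's actual interface.

    def threshold_for_child(ch, TH, UB, my_value, my_context, cost, lb, children):
        """th = min(TH, UB) - local binary costs under my_context
                            - lower bounds reported by the other children.
        cost[(i, my_value, di)]: cost with higher neighbour i taking value di;
        lb[(my_value, k)]: lower bound reported by child k for my_value."""
        local = sum(cost[(i, my_value, di)] for i, di in my_context.items())
        others = sum(lb[(my_value, k)] for k in children if k != ch)
        return min(TH, UB) - local - others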
3 Propagating Unconditional Deletions
Let us consider a DCOP instance, where agents are arranged in a pseudotree and each
executes BnB-ADOPT. Imagine the root agent, with Droot = {a, b, . . .}, and assume
that it takes value a. After a while, root will know cost(a) = lb(a) = ub(a), and it
decides to change its assignment to b, informing its children with the corresponding VALUE messages. The children start answering about the cost of b (this is a change
of context in BnB-ADOPT terms) with COST messages. As soon as root realizes that
cost(b) > cost(a), b can be removed from Droot , since it will not be in the solution
and it will never be considered again (a similar situation happens if cost(a) > cost(b);
then a can be removed from Droot ). Just removing b from Droot has no effect in
BnB-ADOPT, because it will not consider b again as a possible value for root. However,
if we inform the constrained agents that b is no longer in Droot , this may cause some values
of other agents to become infeasible, so they can be deleted as well.
A related situation may happen when the cost of the best solution found so far decreases, passing from ⊤1 to ⊤2 . It may occur that a value a of agent i which could
not be pruned with ⊤1 (Ci (a) < ⊤1 ) can now be pruned with ⊤2 (Ci (a) ≥ ⊤2 ).
These deletions can be further propagated in the same way, decreasing the size of
the search space. It is worth noting that these deletions (whether coming from root or
from any other agent) are unconditional, since (i) root values do not depend on any
context (contexts at root are always empty because there is no higher agent), and (ii)
unary constraints on values of variables do not depend on any context. Any deletion
caused by the propagation of unconditional deletions is also unconditional.
To propagate these value deletions in other agents we need to maintain soft arc
consistency in the distributed instance. In the centralized case, let us consider variables
i and j with unary costs Ci and Cj , constrained by the cost function Cij . The simplest
form of soft arc consistency for weighted CSPs (WAC) is defined as follows:
1. value (i, v) is node consistent (NC) if Ci (v) < ⊤; variable i is NC if every value is
NC; a problem instance is NC if every variable is NC;
2. value (i, v) is arc consistent (AC) with respect to cost function Cij if it is NC and
there exists a value w ∈ Dj such that Cij (v, w) = ⊥. Value w is called a
support for v. Variable i is AC if all its values are AC, and a problem instance
is AC if every variable is AC [3].
Values ⊤ and ⊥ represent the first unacceptable cost (any cost lower than ⊤ is acceptable) and no cost (with the usual addition for cost aggregation, as search progresses
⊤ is the global cost of the best solution found so far and ⊥ is 0). This definition can be
translated into the distributed context: variables are agents, and each agent keeps the unary
and binary constraints in which it is involved (so every agent knows the domain of every
other agent it is constrained with). Node consistency remains unchanged, enforced by
the owner agent, and arc consistency also remains unchanged, enforced by any of the
two agents involved in the constraint.
To apply this idea to DCOPs, we assume that the distributed instance is initially
WAC (otherwise, it can be made WAC by preprocessing). If value a is unconditionally
deleted from Di , it might happen that value a was the only support of a value
b ∈ Dj . For this reason, after the notification of a deletion, directional WAC has to be
enforced on cost function Cij from i to j (observe that enforcing in the other direction
will cause no change). This has to be done in both agents i and j, to ensure that both
maintain the same representation of Cij . In addition, agent j may pass binary costs to
unary costs, which might result in some value b ∈ Dj becoming node inconsistent. In
that case, b should be deleted from Dj and its deletion should be propagated.
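A possible (centralized, purely illustrative) Python sketch of this directional enforcement step is given below; Di, Dj, Cj and Cij are plain Python containers standing in for the agents' local copies, and TOP plays the role of ⊤.

    def enforce_dwac_i_to_j(Di, Dj, Cj, Cij, TOP):
        """Project binary costs C_ij towards j, then prune values of j that
        lose node consistency; returns the deleted values (to be sent in DEL
        messages and propagated further)."""
        for w in list(Dj):
            alpha = min(Cij[(v, w)] for v in Di)   # smallest cost over supports of w
            if alpha > 0:                          # move it into the unary cost of w
                Cj[w] += alpha
                for v in Di:
                    Cij[(v, w)] -= alpha
        deleted = [w for w in Dj if Cj[w] >= TOP]  # node-inconsistent values
        for w in deleted:
            Dj.remove(w)
        return deleted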
The idea of propagating unconditionally deleted values can be included in BnB-ADOPT, producing a new algorithm that exploits the fact that a constraint Cij is
known by both agents i and j. This approach would require some minor changes with
respect to BnB-ADOPT:
1. the domain of every variable constrained with self has to be represented in self ;
2. a new message type, DEL, is required. When self deletes value a in D(self ), it
sends a DEL message to every agent constrained with it. When self receives a
DEL message, it registers that the message value has been deleted from the domain
of sender, and it enforces soft arc consistency on the constraint between self and
sender. If, as a result of this enforcement, some value is deleted in D(self ), it is propagated as above.
Including the propagation of unconditionally deleted values does not change the
semantics of the original BnB-ADOPT messages. In addition, ⊤ should be included in
VALUE messages: from the second value tried at the root, ⊤ is a finite value which
is propagated downwards, informing the other agents of the lowest unacceptable cost.
It is worth noting that this approach keeps the good BnB-ADOPT properties, namely
correctness, completeness and termination: since we are eliminating values which are
either suboptimal (values at root) or not WAC, their removal cannot cause any solution
to be missed. If the value assigned to self is found not to be WAC, this is recorded, the
value is removed when possible, and another value is tried for self. Any value removal
is propagated to the agents constrained with self.
Connecting BnB-ADOPT with the simplest form of soft arc consistency (WAC) is
expected to be beneficial for communication. More complex forms of soft arc consistency remain to be investigated [2], [4]. The use of soft arc consistency in a distributed
context requires a careful analysis, to avoid the extra communication effort needed
to enforce it surpassing the expected benefits in search-space reduction. Preliminary
experimental results indicate an improvement in performance with respect to the BnB-ADOPT algorithm.
References
1. Dechter R., Constraint Processing, Morgan Kaufmann, 2003.
2. de Givry S., Heras F., Larrosa J., Zytnicki M. Existential arc consistency: getting closer to
full arc consistency in weighted CSPs Proc. of IJCAI-05, 2005.
3. Larrosa J. Node and arc consistency in weighted CSP Proc. of AAAI-02, 2002.
4. Larrosa J., Schiex T. In the quest of the best form of local consistency for Weighted CSP
Proc. of IJCAI-03, 2003.
5. Modi P. J., Shen W.M., Tambe M., Yokoo M. Adopt: asynchronous distributed constraint
optimization with quality guarantees. Artificial Intelligence, 161, 149–180, 2005.
6. Petcu A., Faltings B. A scalable method for multiagent constraint optimization Proc. of
IJCAI-05, 266–271, 2005.
7. Silaghi M., Yokoo M. Nogood-based Asynchronous Distributed Optimization (ADOPT-ng).
Proc. of AAMAS-06, 2006.
8. Yeoh W., Felner A., Koenig S. BnB-ADOPT: An Asynchronous Branch-and-Bound DCOP
Algorithm. Proc. of AAMAS-08, 591–598, 2008.
An automaton Constraint for Local Search
Jun He (student), Pierre Flener, and Justin Pearson
Department of Information Technology
Uppsala University, Box 337, SE – 751 05 Uppsala, Sweden
Firstname.Lastname@it.uu.se
Abstract In this paper we explore the idea of using deterministic finite
automata (DFAs) to implement constraints for local search (this is already a successful technique in global search). We show how it is possible
to automatically incrementally maintain violation counts from a specification of a DFA. We show the practicality of the idea on a real-life
scheduling example.
1 Introduction
When a high-level constraint programming (CP) language lacks a (possibly
global) constraint that would allow the formulation of a particular model of a
combinatorial problem, then the modeller traditionally has the choice of (1) switching to another CP language that has all the required constraints, (2) formulating
a different model that does not require any lacking constraints, or (3) implementing the lacking constraint in the low-level implementation language of the
chosen CP language. This paper addresses the core question of facilitating the
third option, and as a side effect makes the first option unnecessary.
The user-level extensibility of CP languages has been an important goal
for over a decade. In the traditional global search approach to CP (namely
heuristic-based tree search interleaved with propagation), higher-level abstractions for describing new constraints include indexicals [15], (possibly enriched)
deterministic finite automata (DFAs) via the automaton [2] and regular [10]
generic constraints, and multi-valued decision diagrams (MDDs) via the mdd
[5] generic constraint. In the more recent local search approach to CP (called
constraint-based local search, CBLS, in [12]), higher-level abstractions for describing new constraints include invariants [8], a subset of first-order logic with
arithmetic via combinators [14] and differentiable invariants [13]; and existential
monadic second-order logic for constraints on set variables [1]. In the former approach, a generic but efficient propagation algorithm achieves a suitable level of
local consistency by processing the higher-level description of the new constraint,
while in the latter approach, a generic but incremental violation maintenance
algorithm is proposed.
In this paper, we revisit the description of new constraints via DFAs, already
successfully tried within the global search approach to CP, and show that it can
also be successfully used within the local search approach. The rest of this paper
is organised as follows. In Section 2, we present our algorithm for incrementally
Figure 1. An automaton for a simple work scheduling constraint
maintaining both the violation degree of a constraint described by a DFA, and
the violation degrees of each decision variable of that constraint. In Section 3, we
present experimental results establishing the practicality of our results. Finally,
in Section 4, we summarise this work and discuss related as well as future work.
2
Incremental Violation Maintenance with DFAs
Our method for implementing a constraint described by a DFA has two major
components: a pre-processing to set up useful data structures, including the
computation of the violations of the constraint and each of its decision variables
for the initial assignment, and the incremental maintenance of these violations
when an actual move is made.
Our running example is the following, for a simple work scheduling constraint.
There are values for two work shifts, day (d) and evening (e), as well as a value for
enjoying a day off (x). Work shifts are subject to the following three conditions:
one must take at least one day off before a change of work shift; one cannot work
for more than two days in a row; and one cannot have more than two days off in
a row. A DFA for this constraint is given in Figure 1. The start state 1 is marked
by a transition entering from nowhere, and the success states 5, 6 are marked
by double circles. Missing transitions, say from state 2 upon reading a value e,
are assumed to go to an implicit failure state, with a self-looping transition for
every value (so that no success state is reachable from it).
The pre-processing is similar to the pre-processing in [10], namely unrolling
the DFA for a given length n of the sequence V = V1 , . . . , Vn of decision variables.
Definition 1 (Layered Graph). Given a DFA with m states, the layered graph
over n variables is a graph with m · (n + 1) nodes. Each of the n + 1 vertical
layers has a node for each of the m DFA states. The node for the start state of
the DFA in the layer 1 is labelled as the start node. There is an arc labelled w
from node f in layer i to node t in layer i + 1 if and only if there is a transition
labelled w from f to t in the DFA. A node in layer n + 1 is labelled as a success
node if it corresponds to a success state in the DFA.
The layered graph is further processed by removing all nodes and paths that
do not lead to a success node. Next, the resulting graph, seen as a DFA (or
ordered MDD), is minimised (or reduced)[6], as the size of the graph influences
the run time of algorithms. For instance, the unrolled version for n = 6 decision
variables of the DFA in Figure 1 is given in Figure 2 (which is best viewed in
colour). The purple number by each node is the number of paths from that node
to a success node in the last layer.

Figure 2. The unrolled automaton of Figure 1
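For concreteness, the unrolling, pruning and path counting described above can be sketched in Python as follows; the DFA encoding (a dictionary of transitions) is our own assumption, and the final minimisation step is omitted.

    def unroll(delta, start, accepting, n):
        """delta: {(state, value): next_state}.  Returns, for every layer
        i = 0..n, a dict mapping each surviving node to its number of paths
        to a success node in the last layer, plus the surviving arcs."""
        # forward pass: nodes reachable from the start node, layer by layer
        layers = [{start}]
        for _ in range(n):
            prev = layers[-1]
            layers.append({u for (t, v), u in delta.items() if t in prev})
        # backward pass: drop nodes and arcs from which no success node is reachable
        paths = [dict() for _ in range(n + 1)]
        paths[n] = {s: 1 for s in layers[n] if s in accepting}
        arcs = []
        for i in range(n - 1, -1, -1):
            for s in layers[i]:
                total = 0
                for (t, v), u in delta.items():
                    if t == s and u in paths[i + 1]:
                        arcs.append((i, s, v, u))
                        total += paths[i + 1][u]
                if total:
                    paths[i][s] = total
        return paths, arcs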
Next, we compute the violations of the automaton constraint and each of its
decision variables for the initial assignment. Toward this, we show how an assignment is transformed into a sequence of disconnected segments in the layered
graph G, where each segment is a connected sequence of arcs in G.
First, if the current assignment is a solution, then there exists a path P in
the layered graph from the start node in the first layer through arcs labelled by
V1 , V2 , . . . , Vn and finally to a success node in the last layer. In this case, we get
only one segment, which is P . Second, if the constraint is unsatisfied, then there is
some index i such that there is a path from the start node in the first layer through arcs labelled
with V1 , V2 , . . . , Vi−1 to a node Si in layer i that has no outgoing arc
labelled with the value Vi . The sequence of arcs labelled with V1 , V2 , . . . , Vi−1 is the
first segment. From the node Si in layer i, we select one of Si ’s successor nodes
and start the next segment. Ideally we want to select a node that corrects the
path in the best possible way. This is done by selecting the next node randomly
from the set of nodes in the next layer that are reachable from the current node
according to a distribution where nodes are weighted according to the number
of paths from the current node to a success node in the last layer. Repeating
this procedure until we arrive at the last layer, we will finally get several disjoint
segments. For instance, in Figure 2, with the assignment V = hx, e, d, e, x, di,
the first segment will be hx, ei. Next, V3 is a violated variable because there is
no arc labelled with d that connects node 4 in layer 3 with any nodes in layer 4.
Node 4 in layer 3 has two out-going arcs that are connected with nodes 3 and 5
in layer 4. In layer 4, there are 4 paths from node 3 compared to 2 paths from
node 5, so node 3 is chosen with probability 4/(4 + 2) = 2/3, and we assume it is chosen.
From node 3 in layer 4, we will get the second segment he, x, di, which stops at
success node 5 in the last layer.
Definition 2 (Violation of an Automaton Constraint and its Variables).
Given an assignment yielding the segments σ1 , . . . , σℓ of the sequence of n decision variables Vi of an automaton constraint c:
– The violation Violation of c is n − Σℓi=1 |σi |.
– The violation Violation(Vi ) of decision variable Vi of c is 0 if there exists a
segment index j in 1, . . . , ℓ such that i ∈ σj , and 1 otherwise.
It can easily be seen that the violation of an automaton constraint is the sum
of the violations of its decision variables, and that it is never an underestimate of
the minimal Hamming distance between the current assignment and any solution
assignment.
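The computation of segments and violations for a given assignment can be sketched (again illustratively, reusing the unroll sketch above) as follows; the weighted random jump follows the description given before Definition 2.

    import random

    def violations(delta, start, assignment, paths):
        """paths[i][s]: number of paths from node s in layer i to a success
        node, as computed by unroll(); assumes the constraint has at least one
        solution (so the start node survives the pruning).  Returns the
        constraint violation and a 0/1 violation per decision variable."""
        n = len(assignment)
        node, seg_len, seg_lens, var_viol = start, 0, [], [0] * n
        for i, v in enumerate(assignment):
            nxt = delta.get((node, v))
            if nxt is not None and nxt in paths[i + 1]:
                node, seg_len = nxt, seg_len + 1        # extend the current segment
            else:                                       # variable V_{i+1} is violated
                var_viol[i] = 1
                seg_lens.append(seg_len)
                seg_len = 0
                # jump to a successor, weighted by its number of paths to success
                succ = [(u, paths[i + 1][u]) for (t, w), u in delta.items()
                        if t == node and u in paths[i + 1]]
                node = random.choices([u for u, _ in succ],
                                      weights=[c for _, c in succ])[0]
        seg_lens.append(seg_len)
        return n - sum(seg_lens), var_viol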
Local search proceeds from a given assignment by checking a number of
neighbours of that assignment and picking a neighbour that ideally reduces the
violation: the exact heuristics are often problem dependent. But in order to
make local search computationally efficient, the violations of the constraint and
its variables have to be computed in an incremental fashion whenever a variable
changes value. When a variable changes value, only the segments after that variable
need to be recomputed. Initialisation and each incremental update only take a linear
amount (in n) of work (algorithm and proof omitted for space reasons).
3 Experiments
Many industries and services need to function around the clock. Rotating schedules are a popular way of guaranteeing a maximum of equity to the involved work
teams (see [7]). In our example in Figure 3(a), there are day (d), evening (e),
and night (n) shifts of work, as well as days off (x). Each team works at most
one shift per day. The scheduling horizon has as many weeks as there are teams.
In the first week, team i is assigned to the schedule in row i. For any next week,
each team moves down to the next row, while the team on the last row moves up
to the first row. The daily workload may be uniform: in Figure 3(a) each day has
exactly one team on-duty for each work shift, and hence two teams entirely offduty; assuming the work shifts average 8h, each employee will work 7·3·8 = 168h
over the five-week-cycle, or 33.6h per week. Daily workload, whether uniform or
not, can be enforced by global cardinality (gcc) constraints on the columns.
Further, any number of consecutive workdays must be between two and seven,
and any change in work shift can only occur after two to seven days off. This
can be enforced by a pattern(X, {(d, x), (e, x), (n, x), (x, d), (x, e), (x, n)}) constraint [4] and a circular stretch(X, [d, e, n, x], [2, 2, 2, 2], [7, 7, 7, 7]) constraint [9]
on the table flattened row-wise into a sequence X. Our model posts the two
required pattern and stretch constraints on the row-wise flattened matrix of decision variables, by actually using the conjunction automaton of the pattern and
stretch DFAs, as this gives better run times than the decomposition. The gcc
constraints on the columns of the matrix are kept invariant: the first assignment
is chosen so as to satisfy them, and then only swap moves (between distinct
values) inside a column are considered. As a meta-heuristic, we use tabu search
         Mon Tue Wed Thu Fri Sat Sun
    1     x   x   x   d   d   d   d
    2     x   x   e   e   e   x   x
    3     d   d   d   x   x   e   e
    4     e   e   x   x   n   n   n
    5     n   n   n   n   x   x   x

(a) Classical rotating schedule

    instance          total time   opt time   iterations
    1d, 1e, 1n, 1x        18           2          14
    1d, 1e, 1n, 2x        26           2           7
    2d, 1e, 1n, 2x        26           2           9
    2d, 2e, 1n, 2x        33           5          17
    2d, 2e, 2n, 2x        42           6          20
    2d, 2e, 2n, 3x        43           7          21
    3d, 2e, 2n, 3x        52           8          25

(b) Experimental results on rotating schedules

Figure 3. (a) A five-week rotating schedule with uniform workload. (b) Average run
times (total and optimisation, in milliseconds) and numbers of iterations to the first
solutions of rotating schedules
with a restarting mechanism. The search has been implemented in Comet using the violations from the algorithm to guide the search by selecting violated
variables at each iteration to be changed. In addition to the classical instance in
Figure 3(a), here denoted 1d, 1e, 1n, 2x, we ran experiments over other instances
with uniform daily workload, namely those over 4 to 10 weeks where the weekly
workload was between 33h and 42h. Figure 3(b) gives the average run times (in
milliseconds) and numbers of iterations over 50 runs to the first solution of each
of these instances. All runs were made under Comet (revision 2.0 Beta) on an
Intel 2.4 GHz Linux machine with 512 MB memory. Although much more experimentation is required, these initial results show that even on instances with
70 decision variables (3d, 2e, 2n, 3x) it is possible to find solutions very quickly.
4 Conclusion
In summary, we have shown that the idea of describing novel constraints by
automata can be successfully imported from classical (global search) constraint
programming to constraint-based local search.
The only related work we are (now) aware of is a Comet implementation [11]
of the regular constraint [10], based on the ideas for the propagator of the soft
regular constraint [16]. The difference is that they estimate the violation change
compared to the nearest solution (in terms of Hamming distance from the current
assignment), whereas we estimate it compared to one randomly picked solution.
Experiments are needed to compare the two approaches.
We do not claim that automata boost the expressive power of Comet, which
already has a very powerful means of achieving extensibility of its modelling
language, namely differentiable invariants [13]. Indeed, re-consider the unrolled
automaton in Figure 2: its paths from source to success nodes correspond to an
extensional definition of the constraint, and can be read off as the nested and/or
formula (V1 = d∧((V2 = x∧. . . )∨(V2 = d∧. . . )))∨(V1 = x∧. . . )∨(V1 = e∧. . . ),
which can be posted as a differentiable invariant (and is more compact than the
disjunctive normal form obtained by the disjunction over all such paths). Our
first experiments (omitted here for space reasons) indicate that our incremental
algorithms for automata exploit the problem structure better than those for
differentiable invariants, but a deeper comparison is needed.
Future work includes a variety of extensions: it is possible to extend the
implementation to handle non-deterministic finite automata, which is interesting
since they are often smaller than the equivalent DFAs; incorporating
reification should also be investigated; and finally, extensions such as the counters
used in [2] should be considered.
Acknowledgements. The authors are supported by grant 2007-6445 of the
Swedish Research Council (VR), and Jun He is also supported by grant 2008611010 of China Scholarship Council and the National University of Defence
Technology of China. Many thanks to Magnus Ågren (SICS) for some useful
discussions on this work, and to the anonymous referees, especially for pointing
out the existence of [11,16].
References
1. M. Ågren, P. Flener, and J. Pearson. Generic incremental algorithms for local
search. Constraints, 12(3):293–324, September 2007.
2. N. Beldiceanu, M. Carlsson, and T. Petit. Deriving filtering algorithms from constraint checkers. Proc. of CP’04, LNCS 3258:107–122. Springer, 2004.
3. N. Beldiceanu, M. Carlsson, and J.-X. Rampon. Global constraint catalogue: Past,
present, and future. Constraints, 12(1):21–62, 2007. Dynamic on-line version at
www.emn.fr/x-info/sdemasse/gccat.
4. S. Bourdais, P. Galinier, and G. Pesant. HIBISCUS: A CP application to staff
scheduling in health care. Proc. of CP’03, LNCS 2833:153–167. Springer, 2003.
5. K.C.K. Cheng and R.H.C. Yap. Maintaining generalized arc consistency on ad hoc
r-ary constraints. Proc. of CP’08, LNCS 5202:509–523. Springer, 2008.
6. M.Z. Lagerkvist. Techniques for Efficient Constraint Propagation. Licentiate Thesis, KTH – The Royal Institute of Technology, Stockholm, Sweden, November 2008.
7. G. Laporte. The art and science of designing rotating schedules. Journal of the
Operational Research Society, 50(10):1011–1017, October 1999.
8. L. Michel and P. Van Hentenryck. Localizer: A modeling language for local search.
Proc. of CP’97, LNCS 1330:237–251. Springer, 1997.
9. G. Pesant. A filtering algorithm for the stretch constraint. Proc. of CP’01, LNCS
2239:183–195. Springer, 2001.
10. G. Pesant. A regular language membership constraint for finite sequences of variables. Proc. of CP’04, LNCS 3258:482–495. Springer, 2004.
11. B. Pralong. Implémentation de la contrainte Regular en Comet. Master Thesis,
École Polytechnique de Montréal, Canada, 2007.
12. P. Van Hentenryck and L. Michel. Constraint-Based Local Search. MIT Press, 2005.
13. P. Van Hentenryck and L. Michel. Differentiable invariants. Proc. of CP’06, LNCS
4204:604–619. Springer, 2006.
14. P. Van Hentenryck, L. Michel, and L. Liu. Constraint-based combinators for local
search. Proc. of CP’04, LNCS 3258:47–61. Springer, 2004.
15. P. Van Hentenryck, V. Saraswat, and Y. Deville. Design, implementation, and
evaluation of the constraint language cc(FD). Journal of Logic Programming 37(1–
3):293–316, 1998. Unpublished manuscript Constraint Processing in cc(FD), 1991.
16. W.-J. van Hoeve, G. Pesant, and L.-M. Rousseau. On global warming: Flow-based
soft global constraints. Journal of Heuristics 12(4-5):347-373, September 2006.
Research Overview: Improved Boolean Satisfiability
Techniques for Haplotype Inference
Student: Eric I. Hsu
Supervisor: Sheila A. McIlraith
Department of Computer Science
University of Toronto
{eihsu,sheila}@cs.toronto.edu,
http://www.cs.toronto.edu
Abstract. Here the authors overview an ongoing effort to extend satisfiabilitybased methods for haplotype inference by pure parsimony (HIPP). This genome
analysis task, first formulated as a boolean satisfiability problem by Lynce and
Marques-Silva [1], has been performed successfully by modern SAT-solvers. But,
it is not as widely used as some better-publicized statistical tools, such as PHASE
[2]. This paper presents the authors’ assessment of the current situation, and a
preliminary statement of intention concerning their aims in this area. Namely,
the situation suggests three categories of improvements for making HIPP more
widely-used within the biological community: 1) the ability to handle larger problems; 2) more detailed empirical understanding of the accuracy of the “pure parsimony” criterion; and 3) additional criteria and methods for improving on this
level of accuracy.
1 Background
As detailed in a recent overview paper [3], the haplotype inference problem is defined
over a population of individuals represented by their respective genotypes. Each genotype can be viewed as a sequence of nucleotide pairs, where the two values of each pair
are split across the individual’s two chromosomes as inherited from their two parents.
Much of the genetic variation between individuals consists of point mutations at known
sites within this sequence, known as single nucleotide polymorphisms (SNP’s). Thus, a
genotype can be represented (with loss of information) as a sequence of nucleotide pairs
at successive SNP sites.
In particular, at each such site, an individual might have two copies of the “minor allele”, in this case the presumably mutated or at least rarer of the two possible
nucleotide values; at this site the individual is then homozygous (minor). Similarly, if a
SNP site is realized on each chromosome by the nucleotide value that is most common for the species, then the site is homozygous (major) for this individual. The third
possibility is that one of the individual’s chromosomes has the major allele at a given
site, while the other chromosome has the minor allele at that site; then the genotype is
heterozygous at the site in question. Notationally, the first case can be represented by
the character ‘0’, the second by ‘1’, and the third, heterozygous case, by ‘2’. So, the
sequence “0102” indicates an individual with two minor alleles at both the first and the
third SNP sites measured by a particular study, and two major alleles at the second site.
At the fourth site, we know that one chromosome exhibits the major allele and the other
exhibits the minor. Thus, an individual’s genotype is well-defined by the sequences of
its two constituent chromosomes; these two individually inherited sequences are called
haplotypes. If we overload ‘0’ and ‘1’ to indicate a single minor or single major allele
in a particular haplotype, then we can denote that the genotype “0102” arises from the
two haplotypes “0100” and “0101”.
Conversely, though, a genotype with more than one heterozygous site can be explained by multiple pairs of haplotypes, as the major and minor alleles at all heterozygous sites in the genotype can be permuted across corresponding sites in the
two haplotypes. More concretely, the genotype “02122” could be realized by the pairs
“00100”/“01111”, “00101”/“01110”, “00110”/“01101”, or “01100”/“00111”. (Naturally, the term “pair” is used colloquially here, and does not signify any sense of ordering within the haplotype sets of size two that arise in this domain.) Accordingly, there
are 2k−1 candidate haplotype pairs for explaining a genotype with k heterozygous sites.
In practice, this distinction is made relevant by the lack of any practical experimental
method to measure an individual’s two haplotypes instead of its genotype; in other
words, the machinery can identify sites at which the two haplotypes have opposing
values, but cannot tell which values are grouped together on which chromosome. The
goal of haplotype inference is to guess the most likely haplotypes that generated a given
set of genotypes.
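As a concrete illustration of this combinatorial explosion, the following short Python sketch (ours, purely expository, not part of the described system) enumerates the 2^(k−1) candidate haplotype pairs for a genotype in the 0/1/2 encoding described above:

from itertools import product

def haplotype_pairs(genotype):
    """Enumerate the unordered haplotype pairs explaining a 0/1/2 genotype."""
    het = [i for i, g in enumerate(genotype) if g == "2"]  # heterozygous sites
    base = list(genotype)           # '0'/'1' sites appear in both haplotypes
    pairs = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = base[:], base[:]
        for site, b in zip(het, bits):
            h1[site] = b
            h2[site] = "1" if b == "0" else "0"           # complementary allele
        pairs.add(frozenset({"".join(h1), "".join(h2)}))  # unordered pair
    return pairs

print(sorted(map(sorted, haplotype_pairs("02122"))))  # the 4 = 2^(3-1) pairs above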
2 Underlying Biological Principles
How can one answer to the haplotype inference problem be preferred over any other? Because the ultimate goal is to accurately predict haplotypes appearing in a particular subject (i.e., human) population, haplotype inference frameworks must apply standards that seek to model the types of phenomena that actually drove the true state of affairs in the evolution of the subject species’ genome. In other words, human haplotypes are not drawn uniformly from the space of all possible pairs that could explain human genotypes. Rather, under the coalescent model of evolution there should be only a small number of human haplotypes that were recombined and mutated to produce any of the genotypes within a given haplotype inference problem instance. Thus, early researchers used greedy methods to try to minimize the set of answers to a haplotype inference problem [4], while later systems integrated models based on “perfect phylogeny” [5] or other hierarchical organizations of answer haplotypes [6]. The most widely used techniques at this point in time integrate empirical statistical measures of likely haplotypes [7–9, 2, 10], by analyzing the population in question or consulting outside sources [11]. Such approaches may not require that the inferred set of answer haplotypes can be arranged into a particular evolutionary structure, but they all require the haplotype set to be maximally likely according to a particular statistical model that has been fit to the problem and/or outside frequency data.
On the other hand, the “pure parsimony” methodology seeks to capture such phenomena implicitly by asking directly for the smallest possible set of haplotypes that as a whole can explain a given population of genotypes [12]. Methods that are based on this criterion thus perform “HIPP”, indicating haplotype inference by pure parsimony. This pure parsimony principle has been achieved optimally and efficiently by employing a variety of satisfiability-based techniques on small to moderately sized data sets of about 200 sites and 100 individuals [1, 13–15]. However, aside from a preliminary and limited evaluation of pure parsimony, which did not include such satisfiability-based techniques [16], the overall accuracy and general feasibility of HIPP do not seem to be well understood within the biological community.
That is, satisfiability-based HIPP methods must demonstrate two forms of feasibility in order to achieve wider adoption within the biological community: the empirical accuracy of the pure parsimony principle itself, and the efficiency and scalability of SAT methods for achieving this parsimony criterion. While SAT-based techniques can achieve optimal parsimony on reasonably large data sets, they still cannot be applied to the massive collections characteristic of more popular biological applications, which may require on the order of half a million sites. Addressing the issue of scalability will not only enable the assessment of model accuracy and solver efficiency, but can additionally lead to more accurate results. This is because multiple minimum haplotype sets can explain the same population of genotypes [3], but some of them can be safely considered more likely a priori. For instance, solution sets whose members are more similar to each other are more realistic with respect to evolutionary theory [2, 10], and certainly one must be preferred a posteriori with respect to the truth: in the real world there was a single haplotype set that produced a given set of genotypes (and this set may or may not be minimum). So while the initial goal is evaluation of the HIPP principle and HIPP solvers, and the prerequisite goal is improved solver scalability, in pursuing these we would like to synergistically attain the overarching goal of making the most accurate predictions possible, as opposed to merely minimal ones.
We propose to do this by making finer-grained use of biological principles in designing SAT-based methodologies for HIPP. One prominent example of such phenomena would be “linkage disequilibrium”, or correlation between sites within a sequence [17]. The sequential and mostly non-random nature of genotypes and haplotypes is at the core of many of the optimizations and models that underlie statistical alternatives to HIPP; finding a way to exploit it within a discrete reasoning framework would make HIPP competitive in efficiency and accuracy. Three proposals for doing so are outlined in the next section.
3 Proposals for Extending the Satisfiability-Based HIPP Framework
In this section we propose three general types of improvements to the HIPP framework that can make it competitive with the current methods of choice within the biological community.
– Exploiting sequential structure to improve scalability. The best-performing (and most widely used) statistical systems [7, 8, 2] for haplotype inference utilize explicit or implicit variants of the “partition ligation” scheme of Niu et al. [18]. The basic idea is to escape the combinatorial explosion of considering the entire space of possible haplotypes for a given sequence, and instead break the sequence into blocks. Each such block is small enough to be solved efficiently and to high accuracy; the blocks are then recombined heuristically through a polynomial-time merging scheme. With the merging scheme comes a loss of optimality; in the case of HIPP we may not get the smallest possible explaining haplotype set by means of this scheme. But the insight of the partition ligation scheme is that evolution does not create haplotypes uniformly at random over all possible explanations for the human genotype, and in practice the loss in optimality has proved negligible in comparison to the gains in accuracy for statistical methods [18, 7, 2]. Beyond being biologically justifiable, block partitioning can actually produce more accurate overall results, because each block can be solved to higher standards of likelihood using more computationally expensive methods. For instance, a statistical approach based on MCMC sampling can choose to perform vastly more iterations on a more complex model within the confines of a small block, while realistically hoping to retain much of the benefit during the merging process [2]. Applying this framework to SAT-based approaches can provide similar gains, whether using parsimony alone or integrating statistical information in solving blocks (see the sketch after this list).
– Assess the accuracy of HIPP on large-scale data. When HIPP is able to handle larger data sets, it will be possible to directly compare its accuracy with other, more popular approaches. This will allow an assessment of the model’s strengths and weaknesses and inform any attempts to improve this accuracy. At this point, SAT methods have been highly successful at solving HIPP, but it remains to be seen whether the resulting answers are themselves successful at modeling the human genome; as mentioned previously here and elsewhere [3], there can be many minimum sets that all qualify as HIPP solutions, while some of these are much more empirically likely than others. Characterizing which types of these solutions usually turn out to be more accurate will go a long way to improving the parsimony model’s fit to real populations.
– Exploiting linkage disequilibrium to improve accuracy. In the same spirit as the first two points, there are important correlations between various regions of the vast majority of a species’ haplotypes, due to the recombination and mutation processes that drive evolution. To compete with statistically oriented tools, HIPP can be extended to encompass the same sort of empirical information concerning such correlations [19, 20]. This may entail a weighted SAT or MaxSAT formulation that favors solutions that adhere to stronger correlations that have been observed from the same sources of data as used by the competitor techniques.
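To make the first proposal concrete, here is a minimal Python sketch (ours, not the authors’ implementation) of the partition-ligation control flow; solve_block and merge are hypothetical callables standing in for a block-level HIPP solver and a heuristic ligation step:

def partition_ligation(genotypes, block_size, solve_block, merge):
    # genotypes:   list of 0/1/2 strings, all of the same length
    # solve_block: hypothetical oracle returning block-level haplotype pairs
    # merge:       hypothetical heuristic combining two adjacent solutions
    n = len(genotypes[0])
    # Split every genotype into consecutive blocks of at most block_size sites.
    blocks = [[g[i:i + block_size] for g in genotypes]
              for i in range(0, n, block_size)]
    # Solve each block independently (this is where SAT-based HIPP would run).
    partial = [solve_block(b) for b in blocks]
    # Ligate: fold adjacent block solutions together, left to right.
    result = partial[0]
    for nxt in partial[1:]:
        result = merge(result, nxt)
    return result

The loss of optimality mentioned above enters only in merge: each call commits to one way of concatenating block-level haplotypes rather than reconsidering the whole sequence.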
4 Conclusion
The authors have begun to implement block decomposition within the SAT-based HIPP framework, but this description of research is decidedly preliminary and strictly for expository purposes. Once the system is able to handle large problems, the next step will be to study its accuracy, especially in terms of adding additional solution criteria to pure parsimony. The concluding step will be to use information derived from individual problem instances and/or reference data to actually achieve such criteria. It would be a great opportunity to be able to discuss such plans with others who are working on this problem and related areas!
References
1. Lynce, I., Marques-Silva, J.: Efficient haplotype inference with boolean satisfiability. In: Proc. of 21st National Conference on Artificial Intelligence (AAAI ’06), Boston, MA (2006)
2. Stephens, M., Scheet, P.: Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. American Journal of Human Genetics 76 (2005) 449–462
3. Lynce, I., Graça, A., Marques-Silva, J., Oliveira, A.L.: Haplotype inference with boolean constraint solving: An overview. In: Proc. of 20th IEEE Int’l Conf. on Tools with Artificial Intelligence (ICTAI ’08), Dayton, OH (2008)
4. Clark, A.G.: Inference of haplotypes from PCR-amplified samples of diploid populations. Molecular Biology and Evolution 7(2) (1990) 111–122
5. Kimmel, G., Shamir, R.: GERBIL: Genotype resolution and block identification using likelihood. PNAS 102(1) (2005) 158–162
6. Xing, E.P., Sohn, K.A., Jordan, M.I., Teh, Y.W.: Bayesian multi-population haplotype inference via a hierarchical dirichlet process mixture. (2006)
7. Marchini, J., Cutler, D., Patterson, N., Stephens, M., Eskin, E., Halperin, E., Lin, S., Qin, Z.S., Munro, H.M., Abecasis, G.R., Donnelly, P.: A comparison of phasing algorithms for trios and unrelated individuals. American Journal of Human Genetics 78 (2006) 437–450
8. Browning, S.R.: Missing data imputation and haplotype phase inference for genome-wide association studies. Human Genetics 124 (2008) 439–450
9. Salem, R.M., Wessel, J., Schork, N.J.: A comprehensive literature review of haplotyping software and methods for use with unrelated individuals. Human Genomics 2
10. Scheet, P., Stephens, M.: A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. American Journal of Human Genetics 78 (2006) 629–644
11. The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature 449 (2007) 851–862
12. Gusfield, D.: Haplotype inference by pure parsimony. In: Proc. of 14th Symp. on Combinatorial Pattern Matching (CPM ’03), Morelia, Mexico (2003) 144–155
13. Graça, A., Marques-Silva, J., Lynce, I., Oliveira, A.L.: Efficient haplotype inference with pseudo-boolean optimization. In: Proc. of 2nd Int’l Conf. on Algebraic Biology (AB ’07), Linz, Austria (2007) 125–139
14. Erdem, E., Türe, F.: Efficient haplotype inference with answer set programming. In: Proc. of 23rd National Conference on A.I. (AAAI ’08), Chicago, IL (2008) 436–441
15. Lynce, I., Marques-Silva, J., Prestwich, S.: Boosting haplotype inference with local search. Constraints 13(1-2) (2008) 155–179
16. Wang, L., Xu, Y.: Haplotype inference by maximum parsimony. Bioinformatics 19(14) (2003) 1773–1780
17. Slatkin, M.: Linkage disequilibrium – understanding the evolutionary past and mapping the medical future. Nature Reviews Genetics 9 (2008) 477–485
18. Niu, T., Qin, Z.S., Xu, X., Liu, J.S.: Bayesian haplotype inference for multiple linked single nucleotide polymorphisms. American Journal of Human Genetics 70(1) (2002) 157–169
19. Excoffier, L., Slatkin, M.: Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Molecular Biology and Evolution 12(5) (1995) 921–927
20. Kuhner, M.K.: LAMARC 2.0: Maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22(6) (2006) 768–770
Foundations of Symmetry Breaking Revisited
Tim Januschowski (student),1,2⋆ Barbara M. Smith,3 and M.R.C. van Dongen2
1 Cork Constraint Computation Centre
2 Computer Science Department, University College Cork, Cork, Ireland
3 School of Computing, University of Leeds, UK
janus@cs.ucc.ie
⋆ Tim Januschowski is supported by the EMBARK initiative of the Irish Research Council for Science, Engineering and Technology.
Abstract. Puget’s article “On the Satisfiability of Symmetrical Constraint Satisfaction Problems” [6] is widely acknowledged to be one of
the founding papers in symmetry breaking (see for example [3, 7]). To
date, Puget’s definition of a valid reduction seems to be the only common
ground that all authors on static symmetry breaking constraints agree
on. His original definition of a valid reduction is for a restricted form
of symmetries. We extend Puget’s definition to the recent, more general
definition of symmetries by Cohen et al. [1]. In our extension, we require
a valid reduction to be a tighter constraint satisfaction problem instead
of one with more constraints. We re-formulate Puget’s central theorems
on valid reductions for our definition and present a stronger result on
the existence of valid reductions.
1 Introduction
Symmetries are a common property of Constraint Satisfaction Problems (csps). In practice, symmetries are often the key to an effective solution of symmetric csps: only if we properly exploit the symmetries may we solve such csps within a reasonable time. The classic approach to dealing with symmetries is the addition of what are traditionally called symmetry breaking constraints before search. This is the most commonly used symmetry breaking approach in practice and it receives considerable attention. In the present work we consider the theoretical foundations of the addition of symmetry breaking constraints, following Puget [6].
Puget introduces his symmetry breaking constraint [6] as a constraint whose addition to a given csp leads to a related csp that he calls a valid reduction. Valid reductions are defined without dependence on any specific symmetry breaking constraint, however. This is why they seem to be the only common ground that authors on symmetry breaking constraints agree on, e.g. [2, 5]. These constraints differ in so many aspects that valid reductions may offer the only framework in which to theoretically compare the different symmetry breaking constraints. Furthermore, valid reductions offer a means to study the theoretical effects of the addition of symmetry breaking constraints. This is the reason why an investigation of valid reductions is important for the current research on symmetry breaking
constraints. We see valid reductions as a first step towards a general theory of
static symmetry breaking.
We informally revisit Puget’s definition of a valid reduction for a restricted form of symmetry. However, the recent definition of symmetries by Cohen et al. [1] is much wider. We carry Puget’s definition of a valid reduction over to this definition of symmetry and require a valid reduction to be a tighter csp than the original csp. Next we provide a simple example showing that adding Puget’s constraints may not change the csp. This underlines the importance of our change of emphasis in the definition. We present a valid reduction that has more symmetries than the original csp. We identify the cases in which we can find a valid reduction that differs from the original csp and we show stronger versions of Puget’s central theorems. Due to space restrictions, we have omitted all proofs. We show an example of a symmetric csp whose symmetries cannot be eliminated.
2 Preliminary Definitions
We define a csp P as the triple (X, D, C), where X is the set of variables of P, where D(x) is the domain of x ∈ X, and where C is the set of constraints of P. Since we are interested in theoretical results, we mainly consider constraints in extensional form. We say that the tuples in the relation of c are allowed by c. All other tuples are forbidden by c. We also call a forbidden tuple a no-good. For ease of presentation, we want to exclude csps that have variables with empty domains. We further assume that the domain of a variable contains no values that are forbidden by a unary constraint. We call a (variable, value)-assignment a literal. A solution of P is an assignment of values to all variables that is allowed by all constraints. If a solution exists, we say that P is satisfiable; otherwise P is unsatisfiable.
A hypergraph G is a tuple (V, E) with V or V(G) the set of nodes and E or E(G) the set of hyperedges. For V̂ ⊆ V, we denote the subgraph of G that is induced by V̂ by G[V̂]. The microstructure complement (msc) of a csp (X, D, C) is the hypergraph G, where V(G) = ⋃{ {x} × D(x) | x ∈ X } and {(x1, t1), (x2, t2), ..., (xk, tk)} is in E(G) if and only if:
– a constraint in C with scope {x1, x2, ..., xk}, k > 1, exists that forbids the tuple ⟨t1, ..., tk⟩, or
– k = 2, x1 = x2 and t1 ≠ t2.
For binary csps, this graph is the complement of the microstructure [4]. We want to define the msc as a loop-free hypergraph, which explains why k > 1. More specifically, we assume that literals forbidden by unary constraints do not appear as nodes in the msc.
A stable set in a hypergraph G is a set of nodes, V̂, such that G[V̂] does not contain any hyperedge. A csp with n variables is satisfiable if and only if its msc contains a stable set with n nodes. A graph automorphism of hypergraph G is a permutation of V(G) that preserves adjacency. We denote the group of automorphisms of G by aut(G). The orbit of a set of nodes V̂ in V(G) is the set of all images of V̂ under elements of aut(G). We denote it by orbit(V̂). From now on, we shall write graph instead of hypergraph. The following definition, by Cohen et al. [1], is the current state of the art for symmetry definitions.
Definition 1 (Constraint Symmetry [1]). Let G be a msc of a csp P . A
symmetry or constraint symmetry of P is an element of aut(G).
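To make these notions concrete, here is a small Python sketch (ours, purely illustrative) that builds the msc of the toy csp with variables x, y ∈ {1, 2} and the single constraint x ≠ y, and recovers its solutions as stable sets of size n = 2:

from itertools import combinations

# Toy csp (not from the paper): x, y in {1, 2}, constraint x != y.
nodes = [("x", 1), ("x", 2), ("y", 1), ("y", 2)]
edges = {frozenset({("x", 1), ("x", 2)}),      # same-variable edges
         frozenset({("y", 1), ("y", 2)})}
edges |= {frozenset({("x", v), ("y", v)})      # no-goods of x != y
          for v in (1, 2)}

def stable_sets(k):
    # Stable sets of size k: node sets whose induced subgraph has no edge.
    return [set(s) for s in combinations(nodes, k)
            if not any(e <= set(s) for e in edges)]

# The stable sets of size n = 2 are exactly the two solutions of the csp.
print(stable_sets(2))  # [{('x', 1), ('y', 2)}, {('x', 2), ('y', 1)}]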
Puget [6] considers a strict subclass of the symmetries in Def. 1. Having defined what the symmetries of a csp are, we proceed by defining a symmetric csp. Clearly, the identity is an automorphism of any csp. It is natural to exclude csps whose only automorphism is the identity from being called symmetric csps. For technical reasons we also exclude csps whose domains are all singletons.
Definition 2 (Symmetric CSP). A csp having at least one variable with domain size strictly larger than 1 is symmetric if it admits more than one symmetry.
3 Valid Reductions
Symmetries partition the set of solutions into orbits. The addition of a symmetry
breaking constraint always produces a csp with at least one representative of
each orbit of solutions: an orbit representative. Puget defines a valid reduction
as a csp with more constraints than the original csp such that for every orbit of
solutions (under Puget’s restricted symmetries) at least one orbit representative
remains. The following is our definition.
Definition 3 (Valid Reduction). Let P = (X, D, C) be a csp with n variables and msc G. Let P̃ = (X, D̃, C̃) be a csp with the same variables as P, with ∅ ≠ D̃(x) ⊆ D(x) for all x ∈ X. We call P̃ a valid reduction of P if the msc G̃ of P̃ fulfils the following conditions:
1. a) V(G̃) ⊆ V(G) and b) E(G[V(G̃)]) ⊆ E(G̃), and
2. for every maximum stable set S of size n in G, at least one element of orbit(S, aut(G)) exists in G̃.
According to the definition, we can produce a valid reduction by adding hyperedges to the msc of the csp, which is equivalent to adding a constraint of arity 2 or higher to the csp, or by removing nodes from the msc, which is equivalent to adding a unary constraint.
Definition 3 is an extension of Puget’s original definition since we allow for more general symmetries to define the orbits. We stress that we may add unary constraints to a csp (Condition 1a). Puget implicitly allows unary constraints in his definition, though “at the core of [his] method” is “adding ordering constraints” [6]. We disallow unary constraints that remove all literals corresponding to a variable. Though this may reduce symmetries, it may make it too difficult to relate a valid reduction to the original csp and to reconstruct solutions. Further work will address this. In Def. 3, we substitute fewer stable sets (partial or full solutions) in the msc for a higher number of constraints. We ensure fewer stable sets in our definition by requiring the msc of a valid reduction either to have fewer nodes or to have more edges on the set of nodes that it and the msc of the original csp have in common (Condition 1b). We shall say that a valid reduction with fewer stable sets in its msc than the original msc is tighter. A higher number of constraints in Puget’s sense could mean a mere repetition of an already present constraint.
Valid reductions are defined in terms of removing symmetric solutions rather
than eliminating symmetries. However, by removing symmetric solutions, we
may also eliminate symmetries. The existence of orbit representatives ensures
that we can reconstruct all solutions of the original csp by applying the symmetries to the orbit representatives in the original csp. Puget proved a weaker
version of the following theorem.
Theorem 4. Any csp P is satisfiable if and only if any valid reduction of P is
satisfiable.
Besides Theorem 4, Puget’s other central result is the existence of a valid reduction for symmetric csps. He proves that, for a csp with a symmetry φ that maps a variable x to a different variable y, adding the constraint x ≤ y always produces a valid reduction. The following example shows that adding this constraint may not produce a tighter valid reduction.
Consider a csp P with 2 variables x and y, each with domain {1, 2}, and one extensional constraint that only allows {(x, 2), (y, 2)} and forbids all other tuples. This csp has a symmetry (also in Puget’s restricted terminology) that swaps the variables x and y. Adding x ≤ y does not change the msc, but raises the number of constraints of P. Hence, producing a tighter valid reduction with Puget’s constraints is not always possible. We believe that inventing the symmetry breaking constraint x ≤ y is a more important achievement than the statement of the existence theorem for valid reductions. Otherwise, Puget could have proved the existence theorem also by introducing a constraint that is a copy of an already existing constraint. Let us give another example, which shows that adding ordering constraints may result in a valid reduction that has more symmetries than the original csp.
Example 5. Consider a csp with two variables x and y, each with domain {1, 2, 3}.
The constraint on x and y allows {(x, 1), (y, 3)}, {(x, 3), (y, 1)} and {(x, 3), (y, 3)}.
The microstructure of this csp with symmetry group of size 4 is depicted in
Fig. 1a. Consider symmetry ψ which swaps (x, 2) with (y, 2) and is the identity
on all other literals. Symmetry ψ cannot be eliminated through the addition
of an ordering constraint, e.g., y ≤ x. We show the microstructure of a valid
reduction in Fig. 1b. The symmetry group of the valid reduction has size 12.
Example 5 shows that adding ordering constraints may not always eliminate
symmetries. However, removing nodes would eliminate the symmetries of Ex. 5.
[Fig. 1: The microstructure of the csp in Ex. 5 and one of its valid reductions. (a) The microstructure of Ex. 5, with nodes (x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3). (b) Adding y ≤ x to the csp in Ex. 5 produces a valid reduction with a larger symmetry group than the original csp.]
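The claimed symmetry group sizes can be checked mechanically. The following brute-force Python sketch (ours, purely expository, feasible only for such tiny graphs) counts the automorphisms of the msc of Ex. 5 before and after adding y ≤ x:

from itertools import permutations

nodes = [(v, a) for v in "xy" for a in (1, 2, 3)]
allowed = {(1, 3), (3, 1), (3, 3)}              # tuples allowed on (x, y)
edges = {frozenset({("x", s), ("y", t)})        # binary no-goods
         for s in (1, 2, 3) for t in (1, 2, 3) if (s, t) not in allowed}
edges |= {frozenset({(v, s), (v, t)})           # same-variable edges
          for v in "xy" for s in (1, 2, 3) for t in (1, 2, 3) if s < t}

def count_automorphisms(edges):
    count = 0
    for perm in permutations(nodes):            # 6! = 720 candidate maps
        f = dict(zip(nodes, perm))
        if all(frozenset(f[u] for u in e) in edges for e in edges):
            count += 1
    return count

print(count_automorphisms(edges))               # 4, as stated in Ex. 5

# Adding y <= x forbids every tuple with s < t, adding the missing no-good.
reduced = edges | {frozenset({("x", s), ("y", t)})
                   for s in (1, 2, 3) for t in (1, 2, 3) if s < t}
print(count_automorphisms(reduced))             # 12, the valid reduction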
In Def. 3, we do not require a valid reduction to be strictly tighter. Any csp, symmetric or not, is a valid reduction of itself. This generalises Puget’s original existence theorem for valid reductions. When we reason about the existence of valid reductions, we must therefore be interested in showing the existence of a valid reduction that is strictly tighter than the original, symmetric csp. We call such a valid reduction proper. The following theorem answers the question of which symmetric csps admit a proper valid reduction.
Theorem 6. A symmetric csp P has a proper valid reduction if and only if it has a partial solution that is not part of a solution, or it has a symmetry φ that maps a solution S to a solution φ(S) such that S ≠ φ(S). If P has such a symmetry, then a proper valid reduction can be obtained by adding a unary constraint.
Theorem 6 singles out the symmetries that map a solution to a different solution as the important ones for valid reductions. However, before starting to prove the satisfiability of a csp, we may only know that symmetries exist, e.g., through automatic detection in preprocessing. We cannot assume that we know whether the symmetries map a solution to a different solution.
In Fig. 2, we depict the microstructure of a symmetric csp that does not admit a proper valid reduction. Apart from the identity, only one automorphism exists. This automorphism swaps the grey-coloured literals (z, 3) and (y, 2), and is the identity on all other literals. Every partial solution is part of a solution whose orbit consists only of itself. Hence, we cannot remove a node or an edge. This means that no proper valid reduction of this csp exists. An example of a non-symmetric csp without a proper valid reduction is the empty csp.
4 Conclusion
We have presented Puget’s fundamental work [6] on symmetry breaking in light
of the recent definition of symmetry [1]. We have re-formulated his central results
and strengthened them. Our results suggest that symmetry breaking based on
the idea of valid reductions may, in theory, only supply additional information
to a problem if symmetric solutions exist. This is somewhat disappointing, since
the existence of symmetric solutions can be decided only by solving the csp.
[Fig. 2: The microstructure of a symmetric csp which does not admit a proper valid reduction, with nodes (x, 1), (x, 2), (y, 1), (y, 2), (z, 1), (z, 2), (z, 3). The only non-trivial symmetry of this csp swaps the grey-coloured nodes (z, 3) and (y, 2). We can neither remove a node nor remove an edge.]
Valid reductions are among the requirements for a constraint to be considered a symmetry breaking constraint. Hence, they offer a possibility to analyse the theoretical power of symmetry breaking constraints. While valid reductions allow for constraints that are far from our intuition about symmetry breaking constraints, valid reductions can still give us an idea about the theoretical limits of symmetry breaking constraints. In further work, we will consider valid reductions with further properties. We will analyse and compare the very diverse symmetry breaking constraints using valid reductions.
References
1. D. Cohen, P. Jeavons, C. Jefferson, K. Petrie, and B. Smith. Symmetry definitions for constraint satisfaction problems. Constraints, 11, 2006.
2. J. Crawford, M. Ginsberg, E. Luks, and A. Roy. Symmetry-breaking predicates for search problems. In KR’96: Principles of Knowledge Representation and Reasoning, pages 148–159. Morgan Kaufmann, 1996.
3. P. Flener, J. Pearson, M. Sellmann, and P. Van Hentenryck. Static and dynamic structural symmetry breaking. In Principles and Practice of Constraint Programming – CP 2006, volume 4204 of Lecture Notes in Computer Science, pages 695–699, 2006.
4. E. C. Freuder. Eliminating interchangeable values in constraint satisfaction problems. In Proceedings AAAI’91, pages 227–233, 1991.
5. V. Kaibel and M. E. Pfetsch. Packing and partitioning orbitopes. Mathematical Programming, 114(1):1–36, 2008.
6. J.-F. Puget. On the satisfiability of symmetrical constrained satisfaction problems. In Methodologies for Intelligent Systems, volume 689 of Lecture Notes in Computer Science, pages 350–361, London, UK, 1993. Springer-Verlag.
7. F. Rossi, P. van Beek, and T. Walsh, editors. Handbook of Constraint Programming. Elsevier, 2006.
Energetic Edge-Finder For Cumulative Resource
Roger Kameugne1 (student) and Laure Pauline Fotso2
1 University of Yaounde 1, Department of Mathematics, P.O. Box 812, Yaounde, Cameroon
rkameugne@yahoo.fr or rkameugne@gmail.com
2 University of Yaounde 1, Department of Computer Science, P.O. Box 812, Yaounde, Cameroon
lpfotso@ballstate.bsu.edu
Abstract. Edge-finding, not-first/not-last, sweeping and overload checking are well-known propagation rules used to prune the start and end times of tasks which have to be processed without interruption on a cumulative resource. In this paper, we present a new propagation rule, called the energetic edge-finder, which can achieve additional pruning. The new algorithm is organized in two phases. The first phase precomputes the innermost maximization in the edge-finder specification and identifies the characteristics of the corresponding task interval. The second phase, based on an energetic reasoning condition, uses this precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²) since the running time of each phase is O(n²).
1 Introduction
As suggested in [MV05], for a given pair (i, Θ), where i is a task and Θ a set of tasks, if it can be proved that in all feasible schedules task i ends after the completion time of all tasks of Θ, then the rest-based update of edge-finding is valid. In this paper, we present a two-phase algorithm, a hybridization of edge-finding and energetic reasoning, which can achieve additional pruning. The first phase precomputes the innermost maximization in the edge-finder specification and identifies the characteristics of the corresponding task interval. It is based on our edge-finder presented in [KF09b]. Indeed, in [KF09b], we propose an iterative O(n²) implementation of edge-finding which reaches the same fixpoint as the rule after several runs. This idea was already used for other rules [TL00,Vi04,KF09a]. The second phase, based on an energetic reasoning condition, uses the precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²) since the running time of each phase is O(n²).
The rest of the paper is organized as follows. Section 2 presents the notations used in the paper. In Section 3, we present our motivations on an example. Section 4 specifies the energetic edge-finder rule and a relaxation of this new rule. Section 5 presents an O(n²) implementation of the relaxed energetic edge-finder rule. Section 6 concludes the paper.
2 Notations
An instance of a CuSP consists of a set T of tasks to be performed on a resource of capacity C. Each task i must be executed (without interruption) over pi time units between an earliest start time ri (release date) and a latest end time di (due date). Moreover, it requires a constant amount of resource ci. K = {ci, i ∈ T} denotes the set of distinct capacity requirements of tasks and k = |K|. Throughout the paper, we assume that ri + pi ≤ di and ci ≤ C, otherwise the problem has no solution. We also assume that all data are integer. A solution of a CuSP is a schedule that assigns a starting date si to each task i so that:
1. ∀i ∈ T : ri ≤ si ≤ si + pi ≤ di,
2. ∀t : Σ{ci | i ∈ T, si ≤ t < si + pi} ≤ C.
ei = ci·pi denotes the energy of task i. We extend the notation from tasks to sets of tasks by setting
rΩ = min{rj | j ∈ Ω},  dΩ = max{dj | j ∈ Ω},  eΩ = Σ{ej | j ∈ Ω}
where Ω is a set of tasks. By convention, when Ω is the empty set, rΩ = +∞, dΩ = −∞ and eΩ = 0. CuSP is a sub-problem of the Resource Constrained Project Scheduling Problem (RCPSP), where precedence constraints are relaxed and a single resource is considered at a time. The CuSP is an NP-complete problem [BL99].
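As an illustration, the following small Python sketch (ours, not from the paper) checks the two solution conditions above for given start dates; the instance used is the one of Table 1 in Section 3, together with an assumed feasible schedule:

# Task fields: release date r, due date d, processing time p, demand c.
tasks = {"a": (1, 10, 4, 1), "b": (3, 6, 1, 2), "c": (3, 6, 1, 2),
         "d": (3, 5, 1, 3), "e": (5, 7, 2, 1)}
C = 3  # resource capacity

def is_solution(start):
    # Condition 1: every task runs inside its time window.
    if not all(r <= start[i] and start[i] + p <= d
               for i, (r, d, p, c) in tasks.items()):
        return False
    # Condition 2: at every time point the capacity is not exceeded.
    horizon = range(min(r for r, d, p, c in tasks.values()),
                    max(d for r, d, p, c in tasks.values()))
    return all(sum(c for i, (r, d, p, c) in tasks.items()
                   if start[i] <= t < start[i] + p) <= C
               for t in horizon)

print(is_solution({"a": 6, "b": 4, "c": 5, "d": 3, "e": 5}))  # True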
Definition 1 (Task Intervals). Let i, k ∈ T (possibly the same task). The task interval Ωi,k is the set of tasks
Ωi,k = {j ∈ T | ri ≤ rj ∧ dj ≤ dk}.
Let i be a task and [t1, t2] be a time interval with t1 < t2. The “left-shift/right-shift” required energy consumption of i over [t1, t2], noted WSh, is ci times the minimum of the three following durations:
• t2 − t1, the length of the interval;
• pi+(t1) = max(0, pi − max(0, t1 − ri)), the number of time units during which i executes after time t1 if i is left-shifted, i.e., scheduled as soon as possible;
• pi−(t2) = max(0, pi − max(0, di − t2)), the number of time units during which i executes before time t2 if i is right-shifted, i.e., scheduled as late as possible.
This leads to WSh(i, t1, t2) = ci · min(t2 − t1, pi+(t1), pi−(t2)). The overall required energy consumption over an interval [t1, t2] is defined as WSh(t1, t2) = Σ{WSh(i, t1, t2) | i ∈ T}. This required energy consumption was first defined in [ELH,Lo].
Other varieties of required energy consumption are available, for example the fully elastic required energy, noted WFE, and the partially elastic required energy, noted WPE [BL99,B02]. It is obvious that if there is a feasible schedule, then
∀t1, ∀t2 ≥ t1 : SSh(t1, t2) = C(t2 − t1) − WSh(t1, t2) ≥ 0.   (1)
We present only the algorithm that updates start times, because the algorithm that updates end times is symmetrical.
3 Motivation
For clarity, let us analyze the example given in Table 1.
Example 1. Consider the CuSP instance of Table 1, where five tasks share a resource of capacity C = 3.

Table 1. An instance of CuSP

task   r   d   p   c
a      1   10  4   1
b      3   6   1   2
c      3   6   1   2
d      3   5   1   3
e      5   7   2   1
This instance satisfies the necessary condition for the existence of a feasible schedule given by relation (1). Using the edge-finding, extended edge-finding, not-first/not-last and sweeping propagation rules on this instance, no time bound is adjusted. But the hybrid rule described above allows the release date of task a to be updated to 4. Indeed, Δ+(a, 3, 6) = 1 and rest(Θ, ca) = 1 with Θ = {b, c, d}, where
Δ+(a, t1, t2) = WSh(t1, t2) − WSh(a, t1, t2) + ca · pa+(t1) − C · (t2 − t1)
and
rest(Θ, ca) = eΘ − (C − ca)(dΘ − rΘ).
Therefore, it follows that ra = rΘ + ⌈rest(Θ, ca)/ca⌉ = 4, since da > dΘ = 6 and Δ+(a, 3, 6) > 0.
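These quantities are easy to recompute. The following Python sketch (ours, purely expository) evaluates WSh, Δ+ and rest on the instance of Table 1 and reproduces the update of ra:

import math

tasks = {"a": (1, 10, 4, 1), "b": (3, 6, 1, 2), "c": (3, 6, 1, 2),
         "d": (3, 5, 1, 3), "e": (5, 7, 2, 1)}  # (r, d, p, c) per task
C = 3

def p_plus(i, t1):
    r, d, p, c = tasks[i]
    return max(0, p - max(0, t1 - r))

def p_minus(i, t2):
    r, d, p, c = tasks[i]
    return max(0, p - max(0, d - t2))

def w_sh(i, t1, t2):
    # left-shift/right-shift required energy of task i over [t1, t2]
    return tasks[i][3] * min(t2 - t1, p_plus(i, t1), p_minus(i, t2))

def delta_plus(i, t1, t2):
    w = sum(w_sh(j, t1, t2) for j in tasks)
    return w - w_sh(i, t1, t2) + tasks[i][3] * p_plus(i, t1) - C * (t2 - t1)

def rest(theta, ci):
    r = min(tasks[j][0] for j in theta)
    d = max(tasks[j][1] for j in theta)
    e = sum(tasks[j][2] * tasks[j][3] for j in theta)  # e_j = p_j * c_j
    return e - (C - ci) * (d - r)

theta = {"b", "c", "d"}
print(delta_plus("a", 3, 6))              # -> 1
print(rest(theta, tasks["a"][3]))         # -> 1
print(3 + math.ceil(rest(theta, 1) / 1))  # new release date of a -> 4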
4 Energetic edge-finder
In this section, after a specification of the energetic edge-finder rule, we focus on its relaxation, for which we can provide a quadratic algorithm.
4.1 Specification of the energetic edge-finder algorithm
As for energetic reasoning, the energetic edge-finder requires a CuSP instance satisfying the necessary condition for the existence of a feasible schedule of relation (1). We have the following definition.
Definition 2 (Energetic edge-finding algorithm). The energetic edge-finding algorithm receives as input a CuSP instance satisfying the necessary condition of relation (1). It produces as output a vector
⟨ELB(1), ..., ELB(n)⟩
where
ELB(i) = max(ri, ELB′(i))
and
ELB′(i) = max{ rΘ + ⌈rest(Θ, ci)/ci⌉ | t1 < t2, Δ+(i, t1, t2) > 0, di > t2, Θ ⊆ T, rest(Θ, ci) > 0, dΘ ≤ t2 }
with
Δ+(i, t1, t2) = WSh(t1, t2) − WSh(i, t1, t2) + ci · pi+(t1) − C · (t2 − t1)
and
rest(Θ, ci) = eΘ − (C − ci)(dΘ − rΘ) if Θ ≠ ∅, and 0 otherwise.
From our O(n²) implementation of edge-finding [KF09b], we can derive a precomputation phase running in O(n²). In spite of our efforts, we were unable to exhibit a quadratic algorithm computing all the adjustments over the O(n²) relevant time intervals.
4.2 Specification of the relaxed energetic edge-finder algorithm
To provide a quadratic algorithm, we have considered a relaxation of the rule. For a given task i, we pay attention only to time intervals [t1, t2] satisfying t1 = rΘ, t2 = dΘ and rest(Θ, ci) > 0. Therefore, we have the following specification of the relaxed version of the energetic edge-finder.
Definition 3 (Relaxed energetic edge-finding algorithm). The relaxed energetic edge-finding algorithm receives as input a CuSP instance satisfying the necessary condition of relation (1). It produces as output a vector
⟨ELB1(1), ..., ELB1(n)⟩
where
ELB1(i) = max(ri, ELB1′(i))
and
ELB1′(i) = max{ rΘ + ⌈rest(Θ, ci)/ci⌉ | Θ ⊆ T, Δ+(i, rΘ, dΘ) > 0, di > dΘ, rest(Θ, ci) > 0 }
with
Δ+(i, rΘ, dΘ) = WSh(rΘ, dΘ) − WSh(i, rΘ, dΘ) + ci · pi+(rΘ) − C · (dΘ − rΘ)
and
rest(Θ, ci) = eΘ − (C − ci)(dΘ − rΘ) if Θ ≠ ∅, and 0 otherwise.

5 Relaxed energetic edge-finder algorithm

5.1 Precomputation
We call PreComp the adapted version of our O(n²) edge-finding algorithm presented in [KF09b], which computes the innermost maximization of the edge-finder specification and identifies the characteristics of the corresponding task interval. It returns an array g of length 3n, where n is the number of tasks. For a given index i with 1 ≤ i ≤ n, g(i) is the value used to update the release date of task X(i), where X is an array of tasks sorted in non-decreasing order of release dates, as required by the algorithm. g(n + i) and g(2n + i) are respectively the release date and the due date of the task interval used to compute g(i). For more detail, see the appendix and [KF09b].
5.2 Relaxed energetic edge-finder algorithm
Using the precomputation phase PreComp, we can derive a second phase for the relaxed form of the energetic edge-finder. This second phase runs in O(n²) and performs additional adjustments.
Require: X is an array of tasks sorted by non-decreasing release dates;
Require: Y is an array of tasks sorted by non-decreasing due dates;
Ensure: ELB(x) are computed for all 1 ≤ x ≤ n.
1: g := PreComp();
2: for i := 1 to n do
3:   ELB(i) := rX(i);
4: end for
5: for i := 1 to n do
6:   if g(i) > LB(i) then
7:     t1 := g(n + i), t2 := g(2n + i);
8:     if t1 < t2 then
9:       W := 0;
10:      for x := 1 to n do
11:        if x ≠ X(i) then
12:          W := W + WSh(x, t1, t2);
13:        end if
14:      end for
15:      if W + cX(i) · pX(i)+(t1) > C(t2 − t1) then
16:        ELB(i) := max(LB(i), g(i));
17:      end if
18:    end if
19:  end if
20: end for
Algorithm 1: CompEEF: Energetic edge-finder algorithm in O(n²) time and O(n) space.
It is possible to increase the power of the algorithm without changing its complexity. Line 16 of Algorithm 1 can be replaced by the following line:
16′: ELB(i) := max(LB(i), g(i), t2 + ⌈(W + cX(i) · pX(i)+(t1) − C(t2 − t1))/ci⌉ − pi)
It is possible to compute the complete energetic edge-finder with the partially elastic required energy in O(n² log(k)). Indeed, with the partially elastic required energy, the time points t1 and t2 correspond to release dates and due dates of some tasks. We can therefore derive an O(n² log(k)) algorithm for the energetic edge-finder, since Baptiste’s algorithm runs in O(n² log(k)) [BL99,B02].
6 Conclusion
Edge-finding, not-first/not-last, sweeping, overload checking and energetic reasoning are well-known propagation rules used to prune the start and end times of tasks which have to be processed without interruption on a cumulative resource. In this paper, we have presented a new propagation rule, called the energetic edge-finder, which can achieve additional pruning. The new algorithm is organized in two phases. The first phase precomputes the innermost maximization in the edge-finder specification and identifies the characteristics of the corresponding task interval. The second phase, based on an energetic reasoning condition, uses this precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²) since the running time of each phase is O(n²).
References
[BL99] Ph. Baptiste, C. Le Pape and W. Nuijten. Satisfiability Tests and Time-Bound Adjustments for Cumulative Scheduling Problems. Annals of Operations Research 92, 305–333 (1999).
[TL00] P. Torres and P. Lopez. On Not-First/Not-Last Conditions in Disjunctive Scheduling. European Journal of Operational Research 127(2), 332–343 (2000).
[Vi04] P. Vilím. O(n log n) Filtering Algorithms for Unary Resource Constraint. In Proceedings of CP-AI-OR (2004).
[KF09b] R. Kameugne, L.P. Fotso and Y. Ngo-Kateu. Cumulative Edge-Finding Algorithm in O(n²). Submitted for publication to INFORMS Journal on Computing (2009).
[KF09a] R. Kameugne, L.P. Fotso and E. Kouakam. Cumulative Not-First/Not-Last Algorithms in O(n³). Submitted for publication to Journal of Scheduling (2009).
[MV05] L. Mercier and P. Van Hentenryck. Edge Finding for Cumulative Scheduling. INFORMS Journal on Computing 20(1), 143–153 (2008).
[B02] Ph. Baptiste. Résultats de Complexité et Programmation par Contraintes pour l’Ordonnancement. HDR thesis, Compiègne University of Technology (2002).
[ELH] P. Esquirol, P. Lopez and M.-J. Huguet. Ordonnancement de la production, chapitre Propagation de contraintes en ordonnancement. Hermès Science Publications, Paris (2001).
[Lo] P. Lopez. Approche énergétique pour l’ordonnancement de tâches sous contraintes de temps et de ressources. Thèse de doctorat, Université Paul Sabatier, Toulouse (1991).
7 Appendix
Proposition 1 summarizes the main dominance properties of the edge-finding rule.
Proposition 1. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows to update the earliest start time of i, then there exist four tasks j, k, j′, k′ such that rj ≤ rj′ < dk′ ≤ dk < di ∧ rj ≤ ri, and the edge-finding rule applied to task i with the pair (Ωj,k, Ωj′,k′) allows the same update of the earliest start time of task i.
Proof. See proofs of Propositions 2, 3, 4 and 6 of [MV05].
Proposition 2. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows to update the earliest start time of i, then there exist two tasks k and jmax1 such that:
1. dk = dΩ ∧ rjmax1 ≤ ri;
2. for all j ∈ T, if rj ≤ ri then
eΩjmax1,k + C·rjmax1 ≥ eΩj,k + C·rj   (2)
3. the pair (Ωjmax1,k, Θ) allows to update the earliest start time of i.
Proof. See [KF09b].
Definition 4 (Pair of interest). Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. A pair (Ω, Θ) is a pair of interest to task i if and only if
α(Ω, i) ∧ di > dΩ ∧ rest(Θ, ci) > 0 ∧ rΘ + (1/ci)·rest(Θ, ci) > ri.
Definition 5. Let i, k be two tasks of an E-feasible CuSP. jmax is a task such that ri < rjmax and, for all tasks j with ri < rj,
eΩj,k / (dk − rj) ≤ eΩjmax,k / (dk − rjmax).
Proposition 3. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows to update the earliest start time of i, then
1. if ri < rΘ, there exists a task jmax such that ri < rjmax and the pair (Ω, Ωjmax,k) is a pair of interest to task i, where dΘ = dk;
2. if rΘ ≤ ri, the pair (Ω, Ω) or (Θ, Θ) is a pair of interest to task i.
Proof. See [KF09b].
According to Proposition 3, after detecting all edge-finding conditions, we may sometimes perform only a weak adjustment. This approach can be applied iteratively, until no further adjustment is found (the fixpoint is reached).
This idea was already used for the disjunctive and cumulative not-first/not-last rules [TL00,Vi04,KF09a]. We have the following precomputation phase.

Require: X is an array of tasks sorted by non-decreasing release dates;
Require: Y is an array of tasks sorted by non-decreasing due dates;
Ensure: g(x) are computed for all 1 ≤ x ≤ 3n.
1: for x := 1 to 3n do
2:   g(x) := −∞;
3: end for
4: for y := 1 to n do
5:   W := 0, maxW := 0, maxEst := −∞;
6:   for x := n down to 1 do
7:     if dX(x) ≤ dY(y) then
8:       W := W + eX(x);
9:       if W/(dY(y) − rX(x)) > maxW/(dY(y) − maxEst) then
10:        maxW := W, maxEst := rX(x);
11:      end if
12:    else
13:      restW := maxW − (C − cX(x))(dY(y) − maxEst);
14:      a := if (restW > 0) then maxEst + ⌈restW/cX(x)⌉ else −∞;
15:      if a > max(g(x), rX(x)) then
16:        g(x) := max(g(x), a), g(n + x) := maxEst, g(2n + x) := dY(y);
17:      end if
18:    end if
19:    W(x) := W;
20:  end for
21:  minSL := −∞, maxEst1 := dY(y);
22:  for x := 1 to n do
23:    if (W(x) − C(dY(y) − rX(x))) > minSL then
24:      maxEst1 := rX(x), minSL := W(x) − C(dY(y) − maxEst1);
25:    end if
26:    if dX(x) > dY(y) then
27:      restW := minSL + cX(x)(dY(y) − maxEst1);
28:      b := if (maxEst1 ≤ dY(y)) ∧ (restW > 0) then maxEst1 + ⌈restW/cX(x)⌉ else −∞;
29:      if b > max(g(x), rX(x)) then
30:        g(x) := max(g(x), b), g(n + x) := maxEst1, g(2n + x) := dY(y);
31:      end if
32:    end if
33:  end for
34: end for
Algorithm 2: PreComp: Edge-finding precomputation algorithm in O(n²) time and O(n) space.
Theorem 1. Algorithm 2 iteratively computes the innermost maximization of the edge-finding specification and identifies the characteristics of the corresponding task intervals. It runs in O(n²) time and uses O(n) space.
Proof. Direct consequence of Proposition 3. The preprocessing time for sorting the arrays X and Y is O(n log n). The first loop (lines 1–3) runs in O(n). The second loop (lines 4–34) contains two inner loops (lines 6–20 and 22–33), each of which runs in O(n); since the outer loop iterates n times, the overall complexity of Algorithm 2 is O(n²).
In Algorithm 2:
• The first main loop (lines 1–3) initializes the array g.
• The second main loop (lines 4–34) iterates over all due dates of tasks in non-decreasing order.
• The first inner loop (lines 6–20) iterates over all release dates sorted in non-increasing order and identifies the task jmax of Definition 5. Lines 15–17 identify and update the characteristics of the potential task interval that can be used for the update.
• The second inner loop (lines 22–33) iterates over all release dates sorted in non-decreasing order and identifies the task jmax1 of Proposition 2. Lines 29–31 identify and update the characteristics of the potential task interval that can be used for the update.
Dominion
A constraint solver generator
Lars Kotthoff (student)
supervised by Ian Miguel and Ian Gent
{larsko,ianm,ipg}@cs.st-andrews.ac.uk
University of St Andrews
Abstract. This paper proposes a design for a system to generate constraint solvers that are specialised for specific problem models. It describes the design in detail and gives preliminary experimental results showing the feasibility and effectiveness of the approach.
1 Introduction
Currently, applying constraint technology to a large, complex problem requires
significant manual tuning by an expert. Such experts are rare. The central aim
of this project is to improve the scalability of constraint technology, while simultaneously removing its reliance on manual tuning by an expert. We propose
a novel, elegant means to achieve this – a constraint solver synthesiser, which
generates a constraint solver specialised to a given problem. Constraints research
has mostly focused on the incremental improvement of general-purpose solvers so
far. The closest point of comparison is currently the G12 project [1], which aims
to combine existing general constraint solvers and solvers from related fields into
a hybrid. There are previous efforts at generating specialised constraint solvers
in the literature, e.g. [2]; we aim to use state-of-the-art constraint solver technology employing a broad range of different techniques. Synthesising a constraint
solver has two key benefits. First, it will enable a fine-grained optimisation not
possible for a general solver, allowing the solving of much larger, more difficult
problems. Second, it will open up many new research possibilities. There are
many techniques in the literature that, although effective in a limited number
of cases, are not suitable for general use. Hence, they are omitted from current
general solvers and remain relatively undeveloped. Among these are for example
conflict recording [3], backjumping [4], singleton arc consistency [5], and neighbourhood inverse consistency [6]. The synthesiser will select such techniques as
they are appropriate for an input problem. Additionally, it can also vary basic
design decisions, which can have a significant impact on performance [7].
The system we are proposing in this paper, Dominion, implements a design
that is capable of achieving said goals effectively and efficiently. The design
decisions we have made are based on our experience with Minion [9] and other
constraint programming systems.
The remainder of this paper is structured as follows. In the next section, we
describe the design of Dominion and which challenges it addresses in particular.
We then present the current partial implementation of the proposed system and
give experimental results obtained with it. We conclude by proposing directions
for future work.
[Figure 1. Components and flow of information in Dominion: problem model → Analyser → solver specification → Generator → specialised solver → solution. The part above the dashed line is the actual Dominion system. The dotted arrow from the problem model to the specialised solver designates that either the model is encoded entirely in the solver such that no further information is required to solve the problem, or the solver requires further input such as problem parameters.]
2 Design of a synthesiser for specialised constraint solvers
The design of Dominion distinguishes two main parts. The analyser analyses
the problem model and produces a solver specification that describes what components the specialised solver needs to have and which algorithms and data
structures to use. The generator takes the solver specification and generates a
solver that conforms to it. The flow of information is illustrated in Figure 1.
Both the analyser and the generator optimise the solver. While the analyser
performs the high-level optimisations that depend on the structure of the problem model, the generator performs low-level optimisations which depend on the
implementation of the solver. Those two parts are independent and linked by the
solver specification, which is completely agnostic of the format of the problem
model and the implementation of the specialised solver. There can be different
front ends for both the analyser and the generator to handle problems specified
in a variety of formats and specialise solvers in a number of different ways, e.g.
based on existing building blocks or synthesised from scratch.
2.1 The analyser
The analyser operates on the model of a constraint problem class or instance. It
determines the constraints, variables, and associated domains required to solve
the problem and reasons about the algorithms and data structures the specialised
solver should use. It makes high-level design decisions, such as whether to use
trailing or copying for backtracking memory. It also decides what propagation
algorithms to use for specific constraints and what level of consistency to enforce.
The output of the analyser is a solver specification that describes all the
design decisions made. It does not necessarily fix all design decisions – it may
use default values – if the analyser is unable to specialise a particular part of the
solver for a particular problem model.
In general terms, the requirements for the solver specification are that it
(a) describes a solver which is able to find solutions to the analysed problem
model and (b) describes optimisations which will make this solver perform better
than a general solver.
The notion of better performance includes run time as well as other resources such as memory. It is furthermore possible to optimise with respect to a particular resource; for example, a solver which uses less memory at the expense of run time can be specified for embedded systems with little memory.
The solver specification may include a representation of the original problem model such that a specialised solver which encodes the problem can be produced; the generated solver then requires no input when run, or only values for the parameters of a problem class. The analyser may furthermore modify the original model in a limited way, for example by splitting variables which were defined as one type into several new types. It does not, however, optimise the model as, for example, Tailor [8] does.
The analyser may read a partial solver specification along with the model of
the problem to be analysed to still allow fine-tuning by human experts while not
requiring it. This also allows for running the analyser incrementally, refining the
solver specification based on analysis and decisions made in earlier steps.
The analyser creates a constraint optimisation model of the problem of specialising a constraint solver. The decision variables are the design decisions to
be made and the values in their domains are the options which are available
for their implementation. The constraints encode which parts are required to
solve the problem and how they interact. For example, the constraints could
require the presence of an integer variable type and an equals constraint which
is able to handle integer variables. A solution to this constraint problem is a
solver specification that describes a solver which is able to solve the problem
described in the original model. The weight attached to each solution describes
the performance of the specialised solver and could be based on static measures
of performance as well as dynamic ones; e.g. predefined numbers describing the
performance of a specific algorithm and experimental results from probing a
specific implementation.
This metamodel enables the use of constraint programming techniques for
generating the specialised solver and ensures that a solver specification can be
created efficiently even for large metamodels.
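As a toy illustration of this metamodel idea, the following Python sketch (ours; all names, options and weights are hypothetical, not Dominion's actual model) treats design decisions as variables, implementation options as domain values, and compatibility rules as constraints, then picks the cheapest consistent specification:

from itertools import product

# Hypothetical design decisions and their implementation options.
domains = {
    "backtracking":  ["trailing", "copying"],
    "var_type":      ["bool", "int", "bounds_int"],
    "eq_propagator": ["ac_eq", "bounds_eq"],
}

# Hypothetical compatibility constraint: the (assumed) AC equality
# propagator cannot handle bounds-only integer variables.
def consistent(spec):
    return not (spec["eq_propagator"] == "ac_eq"
                and spec["var_type"] == "bounds_int")

# Static weights standing in for measured or predicted performance.
weight = {"trailing": 1.0, "copying": 1.2, "bool": 0.5, "int": 1.0,
          "bounds_int": 0.8, "ac_eq": 1.0, "bounds_eq": 0.7}

keys = list(domains)
candidates = (dict(zip(keys, vals)) for vals in product(*domains.values()))
best = min((s for s in candidates if consistent(s)),
           key=lambda s: sum(weight[v] for v in s.values()))
print(best)  # the chosen solver specification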
The result of running the analyser phase of the system is a solver specification
which specifies a solver tailored to the analysed problem model.
2.2 The generator
The generator reads the solver specification produced by the analyser and constructs a specialised constraint solver accordingly. It may modify an existing
solver, or synthesise one from scratch. The generated solver has to conform to
the solver specification, but beyond that, no restrictions are imposed. In particular, the generator does not guarantee that the generated specialised solver
will have better performance than a general solver, or indeed be able to solve
constraint problems at all – this is encoded in the solver specification.
In addition to the high-level design decisions fixed in the solver specification,
the generator can perform low-level optimisations which are specific to the implementation of the specialised solver. It could for example decide to represent
domains with a data type of smaller range than the default one to save space.
The scope of the generator is not limited to generating the source code which
implements the specialised solver, but also includes the system to build it.
The result of running the generator phase of the system is a specialised solver
which conforms to the solver specification.
3 Preliminary implementation and experimental results
We have started implementing the design proposed above in a system which
operates on top of Minion [9]. The analyser reads Minion input files and writes a
solver specification which describes the constraints and the variable types which
are required to solve the problem. It does not currently create a metamodel of
the problem. The generator modifies Minion to support only those constraints
and variable types. It furthermore does some additional low-level optimisations
by removing infrastructure code which is not required for the specialised solver.
The current implementation of Dominion sits between the existing Tailor and
Minion projects – it takes Minion problem files, which may have been generated
by Tailor, as input, and generates a specialised Minion solver.
The generated solver is specialised for models of problem instances from the
problem class the analysed instance belongs to. The models have to be the same
with respect to the constraints and variable types used.
Experimental results for models from four different problem classes are shown
in Figure 2. The graph only compares the CPU time Minion and the specialised
solver took to solve the problem; it does not take into account the overhead of
running Dominion – analysing the problem model, generating the solver, and
compiling it, which was in the order of a few minutes for all of the benchmarks.
The problem classes Balanced Incomplete Block Design, Golomb Ruler, n-Queens, and Social Golfers were chosen because they use a range of different
constraints and variable types. Hence the optimisations Dominion can perform
are different for each of these problem classes. This is reflected in the experimental results by different performance improvements for different classes.
Figure 2 illustrates two key points. The first point is that even a quite basic
implementation of Dominion which does only a few optimisations can yield significant performance improvements over standard Minion. The second point is
that the performance improvement does not only depend on the problem class,
but also on the instance, even if no additional optimisations beyond the class
level were performed. For both the Balanced Incomplete Block Design and the
Social Golfers problem classes the largest instances yield significantly higher
improvements than smaller ones.
At this stage of the implementation, our aim is to show that a specialised
solver can perform better than a general one. We believe that Figure 2 conclusively shows that. As the problem models become larger and take longer to solve,
the improvement in terms of absolute run time difference becomes larger as well.
Hence the more or less constant overhead of running Dominion is amortised for
larger and more difficult problem models, which are our main focus. Generating
[Figure 2 appears here: a scatter plot of the CPU time of the specialised solver relative to Minion (y axis, approximately 0.7 to 1.0) against the CPU time of standard Minion in seconds (x axis, logarithmic scale from 1 to 10000), with one point series per problem class: BIBD (v,k,lambda), Golomb Ruler (ticks), n-Queens (queens), and Social Golfers (weeks,groups,players).]
Figure 2. Preliminary experimental results for models of instances of four problem classes. The x axis shows the time standard Minion took to solve the respective instance. The labels of the data points show the parameters of the problem instance, which are given in parentheses in the legend. The times were obtained using a development version of Minion corresponding to release 0.8.1 and Dominion-generated specialised solvers based on the same version of Minion. Symbols below the solid line designate problem instances where the Dominion-generated solver was faster than Minion; points above the line designate instances where it was slower. The dashed line designates the median over all problem instances.
a specialised solver for problem classes and instances is always going to entail
a certain overhead, making the approach infeasible for small and quick-to-solve
problems.
4
Conclusion and future work
We have described the design of Dominion, a solver generator, and demonstrated
its feasibility by providing a preliminary implementation. We have furthermore
demonstrated the feasibility and effectiveness of the general approach of generating specialised constraint solvers for problem models by running experiments
with Minion and Dominion-generated solvers and obtaining results which show
significant performance improvements. These results do not take the overhead
of running Dominion into account, but we are confident that for large problem
models there will be an overall performance improvement despite the overhead.
Based on our experiences with Dominion, we propose that the next step
should be the generation of specialised variable types for the model of a problem
instance. Dominion will extend Minion and create variable types of the sort
“Integer domain ranging from 10 to 22”. This not only allows us to choose
different representations for variables based on the domain, but also to simplify
and speed up services provided by the variable, such as checking the bounds of
the domain or checking whether a particular value is in the domain.
The implementation of specialised variable types requires generating solvers
for models of problem instances because the analysed problem model is essentially rewritten. The instance the solver was specialised for will be encoded in it
and no further input will be required to solve the problem. We expect this optimisation to provide an additional improvement in performance which is more
consistent across different problem classes, i.e. we expect significant improvements for all problem models and not just some.
We are also planning on continuing to specify the details of Dominion and
implementing it.
5
Acknowledgements
The authors thank Chris Jefferson for extensive help with the internals of Minion
and the anonymous reviewers for their feedback. Lars Kotthoff is supported by
a SICSA studentship.
References
1. Stuckey, P.J., de la Banda, M.J.G., Maher, M.J., Marriott, K., Slaney, J.K., Somogyi, Z., Wallace, M., Walsh, T.: The G12 project: Mapping solver independent
models to efficient solutions. In: ICLP 2005. 9–13
2. Minton, S.: Automatically configuring constraint satisfaction programs: A case
study. Constraints 1 (1996) 7–43
3. Katsirelos, G., Bacchus, F.: Generalized nogoods in CSPs. In: AAAI 2005. 390–396
4. Prosser, P.: Hybrid algorithms for the constraint satisfaction problem. Computational Intelligence 9(3) (1993) 268–299
5. Bessière, C., Debruyne, R.: Theoretical analysis of singleton arc consistency and its
extensions. Artificial Intelligence 172(1) (2008) 29–41
6. Freuder, E.C., Elfe, C.D.: Neighborhood inverse consistency preprocessing. In:
AAAI 1996. 202–208
7. Kotthoff, L.: Constraint solvers: An empirical evaluation of design decisions. CIRCA
preprint (2009) http://www-circa.mcs.st-and.ac.uk/Preprints/solver-design.pdf.
8. Rendl, A., Gent, I.P., Miguel, I.: Tailoring solver-independent constraint models: A
case study with Essence’ and Minion. In: SARA 2007. 184–199
9. Gent, I.P., Jefferson, C., Miguel, I.: MINION: A fast scalable constraint solver. In:
ECAI 2006. 98–102
On learning CSP specifications
Matthieu Lopez⋆1 and Arnaud Lallouet⋆⋆2
1
2
1
Université d’Orléans — LIFO
BP6759, F-45067 Orléans
Matthieu.Lopez@univ-orleans.fr
Université de Caen-Basse Normandie — GREYC
BP 5186 - 14032 Caen
arnaud.lallouet@info.unicaen.fr
Introduction
The Constraint Satisfaction Problem (CSP) makes it possible to model a wide range of decision problems, from arithmetic puzzles to scheduling problems. Even though the goal of constraint programming (CP) is to provide a simple way to formulate problems, experience shows that fair expertise is required to perform the modeling task [Fre97]. For this reason, machine learning of CSP models has been studied extensively in the literature. In this paper, we consider two objects: a model, which corresponds to a CSP, and a specification. Our objective is to propose a way to specify a problem automatically, by machine learning from instances of different sizes.
Model learning, also called constraint network learning, involves obtaining a set of constraints from a set of examples of the model (solutions and no-solutions of the model). A model is a tuple (V, D, C) where V is a set of variables, D a set of domains for the variables and C a set of constraints. The pair (V, D) defines the viewpoint of the model. In particular, the viewpoint (at least the variables) is the same for all examples, and they are all of the same size. CONACQ [BCKO06], based on version spaces, is an example of an algorithm that learns a CSP model. Given solutions and no-solutions of a problem for which the variables of the CSP and a set of candidate constraints are known, it tests, for each constraint, whether the constraint is necessary to the model.
A specification of a problem is a formalization more abstract than a model. Consider for example the n-queens problem, which requires putting n queens on an n×n chessboard such that no queen attacks another. There exist models for 8-queens, 12-queens, etc., whereas a specification aims to formalize the n-queens problem directly. We say that a model represents an instance of a problem. To obtain a model from a specification, additional data are needed, such as the number of variables and their domains. This paper deals with specification learning rather than model learning. In our view, a specification is more natural for users who do not know CP, because providing examples of instances already solved is an easier task than providing examples for the current instance to be solved. Imagine that we want to obtain a school timetable for this year. In past years we have proceeded manually, so we already have a set of solutions for past instances of this problem, but they may be of different sizes because of the number of groups, teachers or rooms available.
In this paper, we propose a language, a subset of first-order logic, for describing a subset of the problems that can be handled in CP. Let us take the graph coloring problem as an example. Given a set S of vertices of a graph G and a set C of colors, the problem is to find a coloring of the vertices such that when two vertices are adjacent they do not have the same color.
(⋆: the student, ⋆⋆: the supervisor)
We can specify the problem in the following way, where adj describes the adjacency relation and color gives the color of a vertex:

∀X, Y ∈ S, ∀A, B ∈ C : adj(X, Y) ∧ color(X, A) ∧ color(Y, B) → A ≠ B
∧
∀X ∈ S, ∀A, B ∈ C : color(X, A) ∧ color(X, B) → A = B
We first detail this language. Next, we present how to obtain a naive model from a specification written in our language. After a formalization of inductive logic programming (ILP), we show how our learning task can be expressed as a standard ILP problem. Finally, we present the limits of state-of-the-art learning algorithms.
2
Specification language
In recent years, the number of high-level specification languages has increased considerably ([FHJ+08, MNR+08, NSB+07a, AA06, PF03, Hni03]). Observing these languages, we notice that their principal interest is to work with collections of variables whose size is unknown and whose domains are not specified. This abstraction is then resolved with a data file that provides the missing information needed to create an instance of the problem (e.g. the translation from n-queens to 8-queens). However, these languages were designed to facilitate modeling for human users, and their complexity is such that they cannot be used as the target of a learning algorithm.
In our language, a specification is a conjunction of rules. These rules describe the way a constraint must be posted in an instance of the problem. The vocabulary of our language is composed of a finite set of predicate symbols with fixed arities, a set of domains composed of constants, and a set of variables (note that these variables and domains are not those of the CSP). Predicates are split into two categories. Some are called description predicates and are used to describe observations in our problem (e.g. in the n-queens problem, the positions of the queens). The others are called constraint predicates and represent a constraint type (e.g. =, ≠, before, ...). An atom is an expression of the form P(t1, ..., tk), where P is a k-ary predicate and t1, ..., tk are terms, i.e. variables or constants. We call an atom whose predicate is a description predicate a description atom and, in a similar way, an atom whose predicate is a constraint predicate a constraint atom. The syntax of a rule is:
rule ::= ∀ variables : body → head
variables ::= vs ∈ DOMAIN | variables, variables
vs ::= VARIABLE | vs, vs
body ::= DESC ATOM | CONSTRAINT ATOM | body ∧ body
head ::= CONSTRAINT ATOM | head
2.1
From specification to model
In order to translate a specification into a constraint network, we need additional information that is not present in the specification. In the graph coloring example, the specification describes a representation of any well-colored graph. To color a graph, we need the set of vertices and the set of colors; we also need to know the neighborhood of the vertices. Thus, to translate a specification, we must provide further information such as the sets corresponding to the domains of the variables in the specification and the “extensions” of certain predicates. The problem is then to find the “extension” of the other predicates.
In the following, we denote by E the set of predicates whose extension is given by the user and by V the set of predicates whose extension must be found. Rather than translating the specification into a low-level language, we prefer, for the sake of clarity, to translate it to a model in MiniZinc [NSB+07b], a language which can be further translated into a low-level language.
Let subst_{L=[X|v,...]}(A) be the operator which substitutes, in an atom A, all occurrences of the variables X in L with their associated values v. For each predicate P : K1 × ... × Kk in V, we must create the following CSP variables: array[K1, ..., Kk] of boolean : P;
Next, each rule R : A1 ∧ ... ∧ Al → H in our specification can be translated as:

constraint
  forall(v1 in K1)( ... forall(vi in Kk)(
    subst_{[X1|v1,...,Xi|vi]}(A1) /\ ... /\ subst_{[X1|v1,...,Xi|vi]}(Al)
      -> subst_{[X1|v1,...,Xi|vi]}(H)
  )...)

Example. Let us take the graph coloring example. The generated variables would be: array[S, C] of boolean : Color; and an example of a generated constraint is:

constraint forall(v1 in S)(forall(v2 in S)(
  forall(v3 in C)(forall(v4 in C)(
    adj(v1, v2) /\ Color(v1, v3) /\ Color(v2, v4) -> v3 != v4
  ))
));
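The translation scheme above is mechanical enough to script. Below is an illustrative Python sketch (not part of the system described in this paper) that emits the MiniZinc text of one rule; it assumes the predicates in V and E are represented as MiniZinc arrays, so atoms are written with index syntax (adj[v1, v2]) rather than the functional notation used above.

def rule_to_minizinc(variables, body_atoms, head_atom):
    """variables: list of (logic variable, domain name), e.g. [("v1", "S"), ("v3", "C")].
    Atoms are (predicate, argument tuple); constraint predicates are emitted infix,
    description and searched predicates as array accesses."""
    def atom(pred, args):
        if pred in {"=", "!=", "<", "<=", ">", ">="}:
            return f"{args[0]} {pred} {args[1]}"
        return f"{pred}[{', '.join(args)}]"

    opening = "".join(f"forall({v} in {dom})(" for v, dom in variables)
    closing = ")" * len(variables)
    body = " /\\ ".join(atom(p, a) for p, a in body_atoms)
    return f"constraint {opening}{body} -> {atom(*head_atom)}{closing};"

# The first graph-colouring rule from above:
print(rule_to_minizinc(
    [("v1", "S"), ("v2", "S"), ("v3", "C"), ("v4", "C")],
    [("adj", ("v1", "v2")), ("Color", ("v1", "v3")), ("Color", ("v2", "v4"))],
    ("!=", ("v3", "v4"))))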
Remark. If we have more information about the predicates, for example that color is a function, we can easily obtain a better model with another choice of viewpoint (creating one variable per vertex, with the colors as its domain).
3
Inductive logic programming
Inductive logic programming (ILP) [Mug95] is a field at the crossroads of logic programming and machine learning. Given sets of positive examples E+ and negative examples E− of a concept C, background knowledge B and a description language L, the goal is to find a definition H of the concept C, described with respect to L in the form of a logic program. H must cover all examples from E+ and must reject the examples from E−.
We now define the form of H and B, then the coverage notion and the nature of the concept C, in a basic kind of ILP without negation and recursion. A clause (resp. query) is a disjunction (resp. conjunction) of literals, i.e. atoms or negations of atoms. A Horn clause is a clause containing only one positive literal. Its form is ∃X1, X2, ... : H ← B1 ∧ B2 ∧ ... ∧ Bn, where each Xi is a variable, the literal H is called the head, and the conjunction of literals B1 ∧ B2 ∧ ... ∧ Bn is called the body. The hypothesis H is a disjunction of Horn clauses. In a similar way, B is a set of Horn clauses describing observations about the examples and new predicates defined intensionally. We say that H covers an example e if H ∪ B |= e. The concept C corresponds to an undefined predicate. Thus the problem consists in learning a definition H for C such that the examples are correctly covered.
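As a concrete illustration of the coverage test in this restricted setting, the following small Python sketch checks whether a conjunctive clause body matches the ground facts describing an example, by brute-force substitution; it is illustrative only and ignores intensionally defined background predicates.

from itertools import product

def covers(clause_body, facts, constants):
    """clause_body: list of (predicate, variable tuple); facts: set of ground atoms."""
    variables = sorted({v for _, args in clause_body for v in args})
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))          # candidate substitution
        if all((p, tuple(theta[v] for v in args)) in facts for p, args in clause_body):
            return True
    return False

facts = {("adj", ("a", "b")), ("color", ("a", "red")), ("color", ("b", "red"))}
body = [("adj", ("X", "Y")), ("color", ("X", "A")), ("color", ("Y", "A"))]
print(covers(body, facts, ["a", "b", "red"]))   # True: two adjacent vertices share a colour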
4
Specification rule learning
In this section, we will present our learning problem as an ILP problem like previously defined. We recall a specification in our language is a conjuction of rules such
that:
73
(∀X11 , . . . : A11 ∧ . . . ∧ A1k → A1k+1 ) ∧ . . . ∧ (∀X1n , . . . : An1 ∧ . . . ∧ Anl → Anl+1 )
The rules can be written as disjunctions:
(∀X11 , . . . : ¬A11 ∨ . . . ∨ ¬A1k ∨ A1k+1 ) ∧ . . . ∧ (∀X1n , . . . : ¬An1 ∨ . . . ∨ ¬Anl ∨ Anl+1 )
The first step consists in choosing the concept we aim to learn. Let spec(pb) be the target concept, where pb is a key to an example.
As seen above, the definition of a concept is a set of clauses, whereas our specification is a conjunction of queries. However, if we search for the inverse concept [Lae02], we can easily get back to the framework previously described. Indeed, by inverting the connectives ∨/∧ and the quantifiers ∀/∃ in a definition of a concept, we obtain a definition of the inverse concept. Let negSpec(pb) be the negated concept. Its definition would be:

∃X_1^1, ... : negSpec(pb) ← A_1^1 ∧ ... ∧ A_k^1 ∧ A_{k+1}^1
...
∃X_1^n, ... : negSpec(pb) ← A_1^n ∧ ... ∧ A_l^n ∧ A_{l+1}^n
Passing to the negated concept, and thus searching for hypotheses describing the inverted problem, our specification learning problem can be defined as follows: given a set of solutions G of our problem and a set of no-solutions NG, and background knowledge B = O ∪ I, where O is a set of observations about the examples (for example the colors of the vertices) and I is a set of constraint predicates given intensionally, the problem is to find a definition H for negSpec such that H covers all examples of NG and rejects all examples of G with respect to B.
5
Difficulties to learn and perspectives
ILP problems can be viewed as search problems. A language L sets the possible
hypotheses which can be built, for our problem a conjunction of observation and
constraint atoms. Thus, states of our search space are possible hypotheses with
respect to L. Defined in this way, our search space is clearly infinite. To limit
the search space, we can set, for example, the number of atoms or variables in
hypotheses. Furthermore the search space can be structured with a partial order.
Because of lack of space, we do not detail this part, but the lector can read [LD93].
The idea is to structure the space in a lattice where hypotheses are ordered by a
relation of generality. A majority of state-of-the-art algorithms are based on this
lattice. To move in this lattice, we need two operators, one to generalize a hypothesis
and another to specialize a hypothesis. These operators enable to travel in the
search space, either in a complete way, enumerating all possible hypotheses, or with
incomplete methods using search strategies like hill-climbing or beam search.
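Schematically, the incomplete general-to-specific search used by many rule learners looks as follows (a hedged Python sketch with a placeholder scoring function; as discussed next, defining such a score is precisely what is problematic for specification rules):

def hill_climb(candidate_atoms, score, max_len=5):
    """Greedy top-down refinement: start from the empty body (always true) and
    repeatedly add the atom that best improves a coverage-based score."""
    hypothesis = []                       # the rule body, initially "true"
    best = score(hypothesis)
    while len(hypothesis) < max_len:
        refinements = [hypothesis + [a] for a in candidate_atoms if a not in hypothesis]
        scored = [(score(h), h) for h in refinements]
        top, h = max(scored, key=lambda t: t[0]) if scored else (best, hypothesis)
        if top <= best:                   # no refinement improves the score: stop
            return hypothesis
        best, hypothesis = top, h
    return hypothesis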
For our problem, the complexity of the coverage test makes complete methods prohibitive, so we have turned to incomplete methods. Two principal approaches have been explored in rule learning. The first consists in starting with the rule that is always true (⊤) and specializing it step by step, i.e. adding new atoms. A specialization operator is therefore required to choose the best atom to add to the hypothesis. To define it, we need a criterion, generally based on the coverage of examples by the new hypothesis. For our rules, however, we did not manage to find such a criterion. Indeed, if we look at our rules, for example in the graph coloring problem, adding color(X, A) or any other atom to the rule body adj(X, Y) does not change the coverage of the rule, which still covers all positive and negative examples. We observe that, for all the specifications we studied, removing one literal from a rule of the specification makes the coverage criterion unusable. The second method consists in starting from the other extremity of the lattice, i.e. the most specific hypothesis. [Mug95] proposes to build the most specific clause from an example extended with the background knowledge, where all constants are replaced by variables. This extension is controlled by a depth level, limiting the creation of new variables. If this clause is correctly extended, it rejects all negative examples and covers at least the positive example that was used to build it. With the generalization operator, we can then search for a hypothesis covering more positive examples. The principal flaw of this method is that it applies the coverage test to very large hypotheses, which makes it unusable in practice: in certain problems we obtain clauses of more than 4000 atoms.
We are now building an alternative method that aims to avoid the unusable extremities of the lattice. The idea is to find, at low cost, a hypothesis located in an area of the lattice where we can move in a relevant way, i.e. travel across the different states of our search space, in order to launch a local search from there.
Bibliography
[AA06] H. Ahriz and I. Arana. Specifying constraint problems with z. Technical
report, The Robert Gordon University, 2006.
[BCKO06] Christian Bessière, Remi Coletta, Frédéric Koriche, and Barry
O’Sullivan. Acquiring constraint networks using a sat-based version
space algorithm. In AAAI, 2006.
[FHJ+08] Alan Frisch, Warwick Harvey, Chris Jefferson, Bernadette Martínez-Hernández, and Ian Miguel. Essence: A constraint language for specifying combinatorial problems. Constraints, 13(3):268–306, 2008.
[Fre97] Eugene C. Freuder. In pursuit of the holy grail. Constraints, 2(1):57–61,
1997.
[Hni03] B. Hnich. Thesis: Function variables for constraint programming. AI
Commun., 16(2):131–132, 2003.
[Lae02] Wim Van Laer. From Propositional to First Order Logic in Machine
Learning and Data Mining. PhD thesis, Katholieke Universiteit Leuven,
June 2002.
[LD93] Nada Lavrac and Saso Dzeroski. Inductive Logic Programming: Techniques and Applications. Routledge, New York, NY, 10001, 1993.
[MNR+ 08] Kim Marriott, Nicholas Nethercote, Reza Rafeh, Peter J. Stuckey,
Maria Garcia de la Banda, and Mark Wallace. The design of the zinc
modelling language. Constraints, 13(3):229–267, 2008.
[Mug95] S. Muggleton. Inverse entailment and progol. New Generation Computing, Special issue on Inductive Logic Programming, 13(3-4):245–286,
1995.
[NSB+ 07a] N. Nethercote, P. J. Stuckey, R. Becket, S. Brand, G. J. Duck, and
G. Tack. Minizinc: Towards a standard cp modelling language. In Christian Bessière, editor, Thirteenth International Conference on Principles
and Practice of Constraint Programming, Lecture Notes in Computer
Science, Providence, RI, USA, sep 2007. Springer-Verlag.
[NSB+ 07b] Nicholas Nethercote, Peter J. Stuckey, Ralph Becket, Sebastian Brand,
Gregory J. Duck, and Guido Tack. Minizinc: Towards a standard cp
modelling language. In CP, pages 529–543, 2007.
[PF03] P. Flener, J. Pearson, and M. Ågren. The syntax, semantics, and type system of ESRA. Research report, ASTRA, April 2003.
Propagating equalities and disequalities
Neil C.A. Moore, supervised by Ian Gent and Ian Miguel
School of Computer Science, University of St Andrews, Scotland
1
Introduction
This paper describes an idea to improve propagation in a constraint solver. The most common frameworks for propagation are based on AC5 [5], where propagators can accept bound changed, assignment and value removal events. The idea is to extend this to allow constraints to generate and process equality and disequality events as well, e.g., r → x = y can generate an equality event whenever r = true, meaning that from now on in any solution x = y. Every constraint in the minion solver [3] can produce such events and benefit in some circumstances from receiving them.
In this paper I will give an example where the idea works successfully, describe the additional propagation that various propagators in minion can achieve, and describe implementation issues.
2
Example
Consider the following CSP: variables x1, ..., xn, y1, ..., yn, each with domain {1, ..., n}, and constraints x1 = y1, xn ≠ yn and, for all i, xi = yi ⇔ xi+1 = yi+1. Chris Jefferson has proved [6] that with any static variable ordering it takes time exponential in n for backtracking search and propagation to prove that no solution exists. However, by propagating equalities and disequalities the following will occur:
1. x1 = y1 will generate the event x1 = y1;
2. x1 = y1 ⇔ x2 = y2 will receive the event and produce x2 = y2;
3. ...;
4. xn−1 = yn−1 ⇔ xn = yn will receive xn−1 = yn−1 and produce xn = yn;
5. the constraint xn ≠ yn will receive the event xn = yn and thus fail.
As I show in Section 4 this can be implemented in polynomial time, and hence a significant speedup is achievable on this type of instance. I have implemented this idea and achieved exponential speedups as expected. I hope to be able to present practical results on this and other more realistic instances during my doctoral programme talk.
3
The technique
I propose to allow propagators to generate and receive events of the form x = y and x ≠ y, where x and y are variables. Propagators are not able to detect such events just by inspecting the solver state, because they may be true without being entailed by the store. For example, in a CSP containing x = y with dom(x) = dom(y) = {1, 2, 3} the propagator can remove no values; x = y is true in any solution but not entailed by the constraint store. Conversely, the AC5-style events can always be detected by propagators inspecting the domain state.
As a first example, I exhibit a rule for the constraint abs (defined x = |y|):
Theorem 1. In constraint x = |y|, if x = y then we can infer that y ≥ 0.
Proof. y = x = |y| =⇒ y = |y| =⇒ y ≥ 0.
The standard propagator for x = |y| will prune so that x ≥ 0 anyway, and this fact combined with the event x = y itself subsumes y ≥ 0. However, I will shortly show that the contrapositive of the rule is useful. As it happens, the converse of the rule is also true:
Theorem 2. In constraint x = |y|, if x ≠ y then we can infer that y < 0.
Proof. |y| = x ≠ y =⇒ |y| ≠ y =⇒ y < 0.
These theorems show that abs can benefit from receiving (dis-)equality events, allowing it to prune all negative (positive) values from the domain of y. Can it also produce such events? I now exhibit a simple meta-theorem to show that any constraint that can use (dis-)equalities can also produce them for other constraints to use:
Theorem 3. In constraint C, if event allows us to infer condition, then ¬condition allows us to infer ¬event.
Proof. Omitted.
For example, a rule to produce events from Theorem 1 can be obtained:
Corollary 1. In constraint x = |y|, if y < 0 then we can infer that x ≠ y.
Proof. Immediate from Theorems 1 and 3.
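To make the rules concrete, here is a small illustrative Python sketch (not Minion's actual propagator interface) of how an abs propagator could react to and produce (dis-)equality events:

def abs_on_event(event, dom_y):
    """React to a (dis-)equality event between x and y in x = |y|."""
    if event == "x=y":        # Theorem 1: y >= 0
        return {v for v in dom_y if v >= 0}
    if event == "x!=y":       # Theorem 2: y < 0
        return {v for v in dom_y if v < 0}
    return dom_y

def abs_produce_event(dom_y):
    """Corollary 1: if y < 0 is already forced by the domain, emit x != y."""
    if dom_y and all(v < 0 for v in dom_y):
        return "x!=y"
    return None

print(abs_on_event("x=y", {-2, -1, 0, 3}))   # {0, 3}
print(abs_produce_event({-3, -1}))           # x!=y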
In outline, in my new propagation framework a constraint may choose to receive (dis-)equality events in addition to the normal set of pruning, assignment and bound events. It can also notify other constraints that it has inferred a (dis-)equality by generating an event. The solver will
– add the event to the store; other constraints can check for (dis-)equalities (analogous to variable checks like inDomain, getMin, etc.);
– pass on (dis-)equalities to constraints that have registered an interest (analogous to static and watched triggers on value removals); and
– propagate the (dis-)equality constraint itself (analogously to how producing the AC5 event x ↚ 1 removes 1 from x's domain).
I have analysed the whole set of constraints in minion, and found that every single one can benefit from (dis-)equality propagation. These have not been chosen specially. A selection are tabulated in Table 1 without proof. For example, Theorems 1 and 2 are reproduced as the first and second lines of the table. I am unable to give the complementary rules by Theorem 3 for space reasons. I attempt to only give rules that are not currently captured by the minion propagation algorithms.
[Table 1 appears here. Its column layout (Constraint / Definition / Condition / Event / Notes) was lost in extraction; the rows give (dis-)equality propagation rules for the constraints abs (Theorems 1 and 2), alldifferent, difference, diseq, div, element, eq, gcc (see Section 3.1), ineq, lexleq (see Section 3.2), lexless, max, modulo, pow, product and table (see Section 3.3). Rules for the and, or, reify and reifyimply metaconstraints and for min, watchvecneq, minuseq, occurrence, sumleq, sumgeq and weightedsum are omitted.]
Table 1. Propagation rules
3.1
Global cardinality constraint
The global cardinality constraint (GCC) [8], given a vector of variables v1, ..., vm, count variables c1, ..., cn and constants val1, ..., valn, ensures that there are ci occurrences of vali in v1, ..., vm.
Without being specific about the propagation algorithm, additional propagation could be achieved in the following case, for example:
1. Say that dom(v1), dom(v2) and dom(v3) were all {1, 2, 3}, and 1 doesn't appear in any other domains.
2. Furthermore, 1 has to be repeated twice in v1, ..., vm.
3. It is easy to tell that either v1 or v2 has to be 1, by the pigeonhole principle.
4. If we now know that v1 = v2 then we can tell straight away that v1 = v2 = 1 and v3 ≠ 1.
Dis-equalities can also be exploited:
1. Say that dom(v1) and dom(v2) are both {1, 2}, and 1 doesn't appear elsewhere.
2. Furthermore, 1 can be repeated either once or twice.
3. If we now know that v1 ≠ v2 then straight away we know that 1 can't be repeated twice.
It is comparatively simple to incorporate disequality information into the gcc algorithm, by using an alternate network flow design instead of the standard one described by Régin [8]. I haven't yet found an efficient design that incorporates equality information. I also haven't tried to make gcc produce (dis-)equalities, although it must be possible.
3.2
Lexicographical ordering constraint
The lexicographical ordering constraint ensures that x1 ... xr ≤lex y1 ... yr, e.g., 100 ≤lex 101 but 101 ≰lex 100. The standard literature algorithm obtaining GAC is given in [1]. It works by maintaining two indices α and β. α is the index such that all more significant indices are assigned to the same value. β is the most significant index such that the tails of the vectors from that position must violate the constraint. For example, the following table gives the current domains of the variables in a lexleq constraint.

i        | 0 | 1       | 2       | 3 | 4
dom(xi)  | 2 | 1, 3, 4 | 2, 3, 4 | 1 | 3, 4, 5
dom(yi)  | 2 | 1       | 1, 2, 3 | 0 | 0, 1, 2

In this case α = 1 because x0 and y0 are assigned the same; β = 3 because any assignment now has x3 > y3 and x4 > y4. The propagator maintains these values α and β as domains are narrowed. If α > β the constraint fails. If α + 1 = β the propagator enforces xα < yα. If α + 1 < β then the propagator enforces xα ≤ yα.
It is quite easy to see how to incorporate (dis-)equality information into the algorithm: if we find out that xα = yα then α can be incremented. Once α + 1 = β, produce the event xα ≠ yα.
3.3
Table constraint
Roughly speaking, the table constraint propagator in minion works by finding a valid supporting tuple for each variable/value pair. To take account of (dis-)equality information the criterion for “validity” is changed. Before, it was that each variable/value pair in the tuple is in its respective domain. Afterwards, tuples must also conform to any and all known (dis-)equalities. For example, the tuple (x = 1, y = 1, z = 2) is disallowed if we know x ≠ y and/or if we know y = z. Conversely, if all tuples whose components are in their respective domains are such that a particular pair of components are always equal or unequal, then the corresponding event can be generated.
4
Implementation
(Dis-)equalities can be handled in a similar way to other propagation events like v ← 1 or v ↚ 3. Note that there are a quadratic number of different (dis-)equality events possible, the same as for (dis-)assignment and bound changed events.
Constraints should have the following facilities available to them:
– Set up static triggers on chosen (dis-)equality events (if the (dis-)equality is already true, it should trigger immediately after being set up).
– Generate (dis-)equality events for chosen pairs of variables.
– Check if an equality is true, false or unknown.
(Dis-)equalities should be removed after backtrack. If x = y and x ≠ y are both generated the solver should fail and backtrack. If x = y and y = z are generated, the solver should ensure x = z is generated.
(Dis-)equalities need only be generated explicitly by propagators and when they are not obvious from the variable domains. For example, even if x = 1 and y = 1, the event x = y may not result. This property is not essential, and it would be worth trying to generate all events; the advantages of the restriction are as follows:
– Events that are obvious from the variable domains should be picked up in the normal course of constraint propagation. It is better for (dis-)equality propagation to be orthogonal to normal propagation so it may be easily switched off.
– It avoids introducing vast numbers of events for problems with small domains where equalities are likely, for example in a boolean problem. Also, no matter the domains, as more variables are assigned more spurious equalities and disequalities are produced.
The disadvantage is that this will lead to code duplication: a propagator might like to use a disequality as a trigger, so that it will only propagate when it receives such an event. However, using this framework it cannot assume that it will get the (dis-)equality trigger and hence it must place other triggers. It would be sensible at some point to evaluate whether generating all possible (dis-)equalities is advantageous.
Implied events. If events x = y and y = z are both produced, then x = z is implied. The solver should ensure that these implied events are created, because the variable y may not be known to a propagator with a trigger on x = z. Implied disequality events also exist. Standard algorithms using the union-find data structure can be exploited to solve these problems, see [7].
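As an illustration of the union-find bookkeeping mentioned above (a generic Python sketch, not Minion code), equalities merge equivalence classes, disequalities are recorded on class representatives, and contradictions cause failure; backtracking is not shown:

class EqualityStore:
    """Tracks asserted equalities/disequalities between variables (illustrative)."""
    def __init__(self):
        self.parent = {}
        self.diseq = set()

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def add_eq(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        self.parent[rx] = ry                # merging classes yields implied equalities
        canon = set()
        for a, b in self.diseq:             # re-canonicalise and check for failure
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                raise RuntimeError("fail: some x = y and x != y both hold")
            canon.add((min(ra, rb), max(ra, rb)))
        self.diseq = canon

    def add_diseq(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            raise RuntimeError("fail: some x = y and x != y both hold")
        self.diseq.add((min(rx, ry), max(rx, ry)))

s = EqualityStore()
s.add_eq("x", "y"); s.add_eq("y", "z")
print(s.find("x") == s.find("z"))           # True: x = z is implied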
When to enable. Potentially, enabling (dis-)equality propagation could be damaging to propagation speed, if the dynamic characteristics of the problem are not suitable. By using a similar technique to [9] the solver can detect cases when (dis-)equality propagation will be unsuccessful. Details are omitted due to space considerations.
5
Previous work
This idea is so simple that it is inevitable that similar work will exist. The aim of the work is to be simple and effective; hence I do not seek to compete with more general ideas. Furthermore, I have tailored the idea to work in a propagation solver, so it is not coincident with similar ideas in other types of automated search and reasoning. I would be interested to hear of any other related work.
Hägglund [4] has worked on allowing arbitrary constraints to be put in the store. This clearly generalises (dis-)equality propagation. My work is on a special case of this idea chosen to be efficient and more common in practice. Constraint handling rule (CHR) solvers [2] can easily incorporate (dis-)equalities into their rules. However, CHR rules are not suitable for replacing constraint propagators for reasons of efficiency. Some satisfiability modulo theories (SMT) solvers use the theory of equalities with uninterpreted functions [7]. Although (dis-)equality propagation applies to this theory and is exploited, the theory is limited in its expressivity compared to what's available in a CSP solver. Some CSP solvers are able to unify variables, meaning that once they are detected to be equal they become the same variable; Eclipse is an example of such a solver. This allows equality propagation to be implemented, but does not include disequalities. Theorem 3 shows that both are necessary to achieve full propagation.
References
1. Alan M. Frisch, Brahim Hnich, Zeynep Kiziltan, Ian Miguel, and Toby Walsh. Propagation algorithms for lexicographic ordering constraints. Artif. Intell., 170(10):803–834, 2006.
2. Thom Frühwirth. Theory and practice of Constraint Handling Rules. J. Logic Programming, Special Issue on Constraint Logic Programming, 37(1–3):95–138, 1998.
3. Ian P. Gent, Christopher Jefferson, and Ian Miguel. Minion: A fast scalable constraint solver. In ECAI, pages 98–102, 2006.
4. Björn Hägglund. A framework for designing constraint stores. Master's thesis, Linköping University, 2007.
5. Pascal Van Hentenryck, Yves Deville, and Choh-Man Teng. A generic arc-consistency algorithm and its specializations. Artificial Intelligence, 57(2–3):291–321, 1992.
6. Chris Jefferson, June 2009. Personal correspondence.
7. R. Nieuwenhuis and A. Oliveras. Decision procedures for SAT, SAT Modulo Theories and beyond. The BarcelogicTools. (Invited paper). In G. Sutcliffe and A. Voronkov, editors, 12th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, LPAR'05, volume 3835 of Lecture Notes in Computer Science, pages 23–46. Springer, 2005.
8. Jean-Charles Régin. Generalized arc consistency for global cardinality constraint. In AAAI/IAAI, Vol. 1, pages 209–215, 1996.
9. Christian Schulte and Peter J. Stuckey. Dynamic analysis of bounds versus domain propagation. In Maria Garcia de la Banda and Enrico Pontelli, editors, Twenty Fourth International Conference on Logic Programming, volume 5366 of Lecture Notes in Computer Science, pages 332–346, Udine, Italy, December 2008. Springer-Verlag.
Tractable Benchmarks
Student: Justyna Petke, Supervisor: Peter Jeavons
Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford,
UK, e-mail: justyna.petke@comlab.ox.ac.uk
Abstract. The general constraint satisfaction problem for variables with
finite domains is known to be NP-complete, but many different conditions have been identified which are sufficient to ensure that classes of
instances satisfying those conditions are tractable, that is, solvable in
polynomial time. Results about tractability have generally been presented in theoretical terms, with little discussion of how these results
impact on practical constraint-solving techniques. In this paper we investigate the performance of several standard constraint solvers on benchmark instances that are designed to satisfy various different conditions
that ensure tractability. We show that in certain cases some existing
solvers are able to automatically take advantage of the problem features
which ensure tractability, and hence solve the corresponding instances
very efficiently. However, we also show that in many cases the existing
pre-processing techniques and solvers are unable to solve efficiently the
families of instances of tractable problems that we generate. We therefore
suggest that such families of instances may provide useful benchmarks
for improving pre-processing and solving techniques. They may also provide good candidates for global constraints that can take advantage of
efficient special-purpose algorithms.
1
Introduction
Software tools for solving finite-domain constraint problems are now freely
available from several groups around the world. Examples include the Gecode [2], G12 [1] and Minion [10] constraint solvers.
One way to drive performance improvements in constraint solvers is to develop challenging benchmark instances. This approach can also help to drive
improvements in the robustness and flexibility of constraint-solving software.
One obvious source of benchmark instances is from practical applications such
as scheduling and manufacturing process organisation. Another source is combinatorial problems such as puzzles and games. The G12 MiniZinc suite includes
several examples.
In this paper we suggest another important source of useful benchmarks
which has not yet been systematically explored: the theoretical study of constraint satisfaction. From the very beginning of the study of constraint programming there has been a strand of research which has focused on identifying
features of constraint problems which make them tractable to solve [6, 8] and
this research has gathered pace recently with the discovery of some deep connections between constraint problems and algebra [3, 4], logic [7], and graph and
hypergraph theory [5, 11].
This research has focused on two main ways in which imposing restrictions
on a constraint problem can ensure that it can be tractably solved. The first of
these is to restrict the forms of constraint which are allowed; these are sometimes
known as constraint language restrictions. The second standard approach to
identifying restrictions on constraint problems which ensure tractability has been
to consider restrictions on the way in which the constraints overlap; these are
sometimes referred to as structural restrictions.
In this paper we begin the process of translating from theoretical results in
the literature to concrete families of instances of constraint problems. We obtain
several families which are known to be efficiently solvable by simple algorithms,
but which cause great difficulties for some existing constraint solvers. We argue
that such families of instances provide a useful addition to benchmark suites
derived from other sources, and can provide a fruitful challenge to the developers
of general-purpose solvers.
2
Definitions
In the theoretical literature the (finite-domain) constraint satisfaction
problem (CSP) is typically defined as follows:
Definition 2.1. An instance of the constraint satisfaction problem is specified by
a triple (V, D, C), where
– V is a finite set of variables
– D is a finite set of values (this set is called the domain)
– C is a finite set of constraints. Each constraint in C is a pair (Ri , Si ) where
• Si is an ordered list of ki variables, called the constraint scope;
• Ri is a relation over D of arity ki , called the constraint relation.
Two proposed standard higher-level languages for specifying constraint problems in practice are Zinc [14] and Essence [9]. However, both of these languages
are considered too abstract and too general to be used directly as the input language for current constraint solvers, so they both have more restricted subsets
which are more suitable for solver input: MiniZinc and Essence′ . There exists a
software translation tool, called Tailor [15], which converts from Essence′ specifications to the input language for the Minion solver (or Gecode). Another software
translation tool distributed with the G12/MiniZinc software [1], converts from
MiniZinc to a more restricted language known as FlatZinc, that serves as the input language for the G12 solver; FlatZinc input is also accepted by Gecode. The
FznTini solver, developed by Huang, transforms a FlatZinc file into DIMACS
CNF format and then uses a Boolean Satisfiability problem (SAT) solver, called
TiniSAT, to solve the resulting SAT problem [12].
Definition 2.2. A class of CSP instances will be called tractable if there exists
an algorithm which finds a solution to all instances in that class, or reports that
there are no solutions, whose time complexity is polynomial in the size of the
instance specification.
3
Max-closed constraints
One of the first non-trivial classes of tractable constraint types described in
the literature is the class of max-closed constraints introduced in [13].
Definition 3.1 ([13]). A constraint (R, S) with relation R of arity r over an ordered domain D is said to be max-closed if for all tuples (d1, ..., dr), (d′1, ..., d′r) ∈ R we have (max(d1, d′1), ..., max(dr, d′r)) ∈ R.
In particular, one useful form of max-closed constraint is an inequality of the
form a1X1 + a2X2 + · · · + ar−1Xr−1 ≥ arXr + c, where the Xi are variables, c is a constant, and the ai are non-negative constants [13].
To generate solvable max-closed CSP instances, we selected a random assignment to all of the variables, and then generated random inequalities of the
form above, keeping only those that were satisfied by this fixed assignment. This
ensured that the system of inequalities had at least one solution. To generate
unsolvable max-closed CSP instances, we generated the inequalities without imposing this condition; if the resulting set was solvable, another set was generated.
As with all of the results presented in this paper, we solved the generated problem
instances using Gecode (version 1.3), G12 (version 0.8.1), FznTini, and Minion
(version 0.8RC1) and the times given are elapsed times on a Lenovo 3000 N200
laptop with an Intel Core 2 Duo processor running at 1.66GHz and 2GB of RAM.
These timings exclude the time required to translate the input from MiniZinc
to FlatZinc (for input to Gecode, G12 and FznTini) or from Essence′ to Minion
input format. (In the special case of FznTini, times include the additional time
required to translate from FlatZinc to DIMACS CNF format.) Average times
over three runs with different generated instances were taken, but the variability
was found in all cases to be quite small.
Predictably, FznTini performs very poorly on these inequalities, which it has
to translate into (large) sets of clauses. Standard CSP solvers should do well on
these instances, because the efficient algorithm for solving max-closed instances
is based on achieving arc-consistency, and all standard constraint solvers do this
by default. Our results confirm that the standard CSP solvers do indeed all
perform well on these instances, although the G12 solver was noticeably less
efficient than the other two.
4
0/1/all constraints
Our second example of a language-based restriction ensuring tractability involves the 0/1/all constraints introduced and shown to be tractable in [6].
Definition 4.1 ([6]). Let x1 and x2 be variables. Let A be a subset of possible
values for x1 and B be a subset of possible values for x2 .
– A complete binary constraint is a constraint R(x1, x2) of the form A × B.
– A permutation constraint is a constraint R(x1, x2) which is equal to {(v, π(v)) | v ∈ A} for some bijection π : A → B.
– A two-fan constraint is a constraint R(x1, x2) where there exists v ∈ A and w ∈ B with R(x1, x2) = ({v} × B) ∪ (A × {w}).
A 0/1/all constraint is either a complete constraint, a permutation constraint, or a two-fan constraint.
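For concreteness, the three constraint types of Definition 4.1 can be written out as explicit relations; the following short Python sketch does exactly that (the domains and the bijection are example values only).

def complete(A, B):
    return {(a, b) for a in A for b in B}

def permutation(A, pi):
    return {(a, pi[a]) for a in A}           # pi is a bijection from A onto B

def two_fan(A, B, v, w):
    assert v in A and w in B
    return {(v, b) for b in B} | {(a, w) for a in A}

A, B = {1, 2, 3}, {4, 5, 6}
print(permutation(A, {1: 5, 2: 4, 3: 6}))
print(two_fan(A, B, v=2, w=6))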
What is particularly interesting about this form of constraint, for our purposes,
is that the efficient algorithm for 0/1/all constraints is based on achieving pathconsistency [6], which is not implemented in standard constraint solvers.
We wrote a generator for satisfiable CSP instances with 0/1/all constraints of
various kinds on n variables. To ensure satisfiability we first generate a random
assignment and then add only those 0/1/all constraints that satisfy the initial
assignment. We generated instances for various choices of the parameter n and
domain size m, and solved these using Gecode, G12, FznTini, and Minion. All
the solvers performed very well, especially Gecode. We also generated unsatisfiable instances with 0/1/all constraints on just a small number of variables,
leaving all other variables unconstrained. FznTini's results on these unsatisfiable instances were inconclusive [12]. The G12 solver reported ‘no solutions’ in
0.2 seconds, but Minion and Gecode could not solve this problem within 15 min.
When the domain size was decreased to two, none of the solvers could verify that
this simple unsatisfiable instance had no solutions within 15 minutes. The problem seems to be the fixed default variable ordering. None of the standard solvers
focus the search on the few variables that are restricted; having no constraint
between two variables is treated in the same way as having a complete constraint. Once the unsatisfiable instances were embedded in satisfiable instances,
the performance of all the solvers was as good as before.
These results suggest that standard CSP solvers can handle random collections of 0/1/all constraints very effectively, even without specialised algorithms.
However, they appear to be poor at focusing search on more highly constrained
regions, which is thought to be one of the strengths of the current generation of
SAT-solvers. This suggests an obvious target for improvement in adapting the
variable ordering to the specific features of the input instance.
5
Bounded-width structures
For our final example, we consider classes of CSP instances which are tractable
because of the way that the constraint scopes are chosen. In other words, we consider structural restrictions.
For any CSP instance, the scopes of all the constraints can be viewed as
the hyperedges of an associated hypergraph whose vertices are the variables.
This hypergraph is called the structure of the CSP instance. One very simple
condition which is sufficient to ensure tractability is to require the structure to
have a tree decomposition [11], with some fixed bound on the maximum number
of vertices in any node of the tree. Such structures are said to have bounded
width. However, the efficient algorithm for CSP instances with bounded width
structures is based on choosing an appropriate variable ordering, and imposing
a level of consistency proportional to the width [7]. None of the standard CSP
solvers incorporate such algorithms, so it is not at all evident whether they can
solve bounded width instances efficiently.
To investigate this question we wrote a generator for a family of specific
CSP instances with a very simple bounded-width structure. The instances we
generate are specified by two parameters, w and m. They have (mw + 1) ∗ w
variables arranged in groups of size w, each with domain {0, ..., m}. We impose
a constraint of arity 2w on each pair of successive groups, requiring that the
sum of the values assigned to the first of these two groups should be larger than
the sum of the values assigned to the second. This ensures that a solution exists
and satisfies the following conditions: the difference between the sum of values
assigned to each successive group is 1, and the sum of the values assigned to the
last group is zero.
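A sketch of the instance family just described, built as plain data (emitting actual MiniZinc or Minion input is omitted); the naming of variables is an illustrative choice.

def bounded_width_instance(w, m):
    """(m*w + 1) groups of w variables, each with domain {0,...,m}, and a
    constraint between each pair of successive groups requiring the first
    group's sum to exceed the second group's sum."""
    n_groups = m * w + 1
    groups = [[f"x_{g}_{i}" for i in range(w)] for g in range(n_groups)]
    domain = list(range(m + 1))
    constraints = [("sum_greater", groups[g], groups[g + 1]) for g in range(n_groups - 1)]
    return groups, domain, constraints

groups, dom, cons = bounded_width_instance(w=2, m=3)
print(len(groups), "groups,", len(cons), "constraints")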
It turns out that the runtimes of Gecode and Minion grow rapidly with
problem size. The runtimes for the G12 solver do not increase so fast for these
specific instances, but if we reverse the inequalities, then they do increase in
the same way, although in this case Gecode and Minion perform much better.
Somewhat surprisingly, FznTini seems to be able to solve all of these instances
fairly efficiently, even though they contain arithmetic inequalities which have to
be translated into fairly large sets of clauses.
It is concluded that an important opportunity to improve the performance of
CSP solvers would be in finding an efficient way of taking advantage of instance
structure by adapting the variable ordering or other aspects of the search process
to the particular instance. Moreover, as the ordering can be set in the input
file, the question arises as to whether those adjustments could be automatically
identified by the translators as part of the pre-processing.
6
Conclusions
We believe that the results presented in this paper have established that the
various ideas about different forms of tractable CSP instances presented in the
theoretical literature can provide a fruitful source of inspiration for the design of
challenging benchmarks. The initial applications of these ideas, presented in the
previous sections, have already identified significant differences between solvers
in their ability to exploit salient features of the problem instances they are given.
There are a number of technical difficulties to overcome in developing useful
benchmark instances. First of all, unlike SAT solvers, there is no standard input
language for CSP solvers. The translation from MiniZinc to FlatZinc, or from
Essence′ to Minion, can sometimes obscure the nature of an essentially simple
problem, and hence badly affect the efficiency of the solution. On the other
hand, the cost of standardisation is a loss of expressive power. We have seen in
Section 3 that translating simple forms of constraints such as linear inequalities
into CNF may be very inefficient, and may lose the important features of the
constraints which guarantee tractability. We suggest that a better awareness of
the factors of a problem specification that can ensure tractability could lead to
better translation software, which ensures that such features are preserved.
Even when they have been successfully captured in an appropriate specification language, and input to a constraint solver, it can be the case that theoretically tractable instances may still be solved very inefficiently in practice. We
have seen in Sections 4, and especially in Section 5, that when the tractability is
due to a property that requires a higher level of consistency than arc-consistency
to exploit, instances may be very challenging for standard solvers. Finding effective automatic ways to improve the variable orderings and value orderings used
by a solver according to specific relevant features of the input instance seems a
promising first step which has not been sufficiently pursued.
Summing up, in order to improve the performance of constraint solvers we
need effective benchmarks which can explore that performance over a range of
different problem types with different characteristics. One way to systematically
develop such benchmarks is to use the insights from the theoretical study of
constraint satisfaction. Benchmarks derived in this way can be simple enough
to analyse in detail, and yet challenging enough to reveal specific weaknesses in
solver techniques. This paper has begun to explore the potential of this approach.
References
1. G12/MiniZinc constraint solver. Software available at http://www.g12.cs.mu.oz.au/minizinc/download.html.
2. Gecode constraint solver. Software available at http://www.gecode.org/.
3. A. Bulatov, A. Krokhin, and P. Jeavons. Classifying the complexity of constraints
using finite algebras. SIAM Journal on Computing, 34(3):720–742, 2005.
4. A. Bulatov and M. Valeriote. Recent results on the algebraic approach to the CSP.
In Complexity of Constraints, volume 5250 of Lecture Notes in Computer Science,
pages 68–92. Springer, 2008.
5. D. Cohen, P. Jeavons, and M. Gyssens. A unified theory of structural tractability
for constraint satisfaction problems. Journal of Computer and System Sciences,
74:721–743, 2007.
6. M.C. Cooper, D.A. Cohen, and P.G. Jeavons. Characterising tractable constraints.
Artificial Intelligence, 65:347–361, 1994.
7. V. Dalmau, Ph. Kolaitis, and M. Vardi. Constraint satisfaction, bounded treewidth,
and finite-variable logics. In Proceedings of CP’02, volume 2470 of Lecture Notes
in Computer Science, pages 310–326. Springer-Verlag, 2002.
8. E.C. Freuder. A sufficient condition for backtrack-bounded search. Journal of the
ACM, 32:755–761, 1985.
9. A. Frisch, W. Harvey, C. Jefferson, B. Martínez-Hernández, and I. Miguel. The
essence of ESSENCE: A constraint language for specifying combinatorial problems.
In Proceedings of IJCAI’05, pages 73–88, 2005.
10. I. Gent, C. Jefferson, and I. Miguel. Minion: A fast scalable constraint solver. In
Proceeedings ECAI 2006, pages 98–102. IOS Press, 2006. Software available at
http://minion.sourceforge.net/.
11. M. Grohe. The structure of tractable constraint satisfaction problems. In MFCS
2006, Lecture Notes in Computer Science, 4162, pages 58–72. Springer, 2006.
12. J. Huang. Universal Booleanization of constraint models. In Proceedings of CP’08,
volume 5202 of Lecture Notes in Computer Science, pages 144–158. Springer, 2008.
13. P.G. Jeavons and M.C. Cooper. Tractable constraints on ordered domains. Artificial Intelligence, 79(2):327–339, 1995.
14. N. Nethercote, P. Stuckey, R. Becket, S. Brand, G. Duck, and G. Tack. MiniZinc:
Towards a standard modelling language. In Proceedings of CP’07, volume 4741 of
Lecture Notes in Computer Science, pages 529–543. Springer, 2007.
15. Andrea Rendl. TAILOR - tailoring Essence′ constraint models to constraint solvers.
Software available at http://www.cs.st-andrews.ac.uk/~andrea/tailor/.
A simple efficient exact algorithm based on independent
set for Maxclique problem
Chu Min Li and Zhe Quan (student)
MIS, EA4290, Université de Picardie Jules Verne, 33 rue St. Leu 80039 Amiens, France
Abstract. In this paper we propose a simple and efficient exact algorithm for the
famous Maxclique problem, called iMaxClique, using a new and powerful upper bound based on a partition of a graph into independent subsets. Experimental results show that iMaxClique is very fast on DIMACS Maxclique benchmarks
and solves six instances remaining open in [9].
Keywords: Branch-and-Bound, Independent set, Maxclique
1 Introduction
Consider an undirected graph G=(V , E), where V is a set of n vertices {v1 , v2 , ..., vn }
and E is a set of m edges. Edge (vi, vj) with i ≠ j is said to connect vertices vi and vj.
The graph is undirected because we do not distinguish (vi , vj ) from (vj , vi ). A clique
of G is a subset C of V such that every two vertices in C are connected by an edge in
E. The maximum clique problem (Maxclique for short) consists in finding a clique of
G of the largest cardinality. Maxclique is known to be NP-hard and is very important in
many real-world applications.
There are mainly two types of algorithms to solve the Maxclique problem: heuristic
algorithms including greedy construction algorithms and stochastic local search algorithms (see e.g. [8,1,3,5,6,10]), and exact algorithms including branch and bound algorithms (see e.g. [4,9]). Heuristic algorithms are able to solve large and hard Maxclique
problems but are unable to claim the optimality of their solutions. Exact algorithms may
fail to solve large and hard Maxclique problems, but are sure of the optimality of the
solutions they are able to find.
An independent set of a graph is a subset I of V such that no two vertices in I are
connected by an edge. The maximum independent set problem consists in finding
an independent set of G of the largest cardinality. Maxclique is closely related to the
maximum independent set problem, because a clique of G is an independent set of its
complementary graph Ḡ, and vice versa, where Ḡ is defined to be (V, Ē) with Ē={(vi,
vj) | (vi, vj) ∈ V × V, i ≠ j and (vi, vj) ∉ E}.
In this paper, we propose a simple and efficient branch and bound algorithm for
Maxclique. We call it iMaxClique because it uses a new upper bound based on an
independent set partition of G. In Section 2, we present iMaxClique and the new upper
bound. In Section 3, we compare iMaxClique with the best state-of-the-art exact
algorithms [4,9] to our knowledge on the widely used DIMACS Maxclique benchmark¹.
The experimental results show that iMaxClique is significantly faster and is able to
close 6 of the 14 problems remaining open in [4] and [9]. We conclude in Section 4.
¹ Available at http://cs.hbg.psu.edu/txn131/clique.html
Algorithm 1 iMaxClique(G, C, LB), a branch and bound algorithm for Maxclique
Input: A graph G=(V, E), a clique C, and the cardinality LB of the largest clique found so far
Output: A maximum clique of G
begin
  if |V|=0 then return C;
  UB ← overestimation(G)+|C|;
  if LB ≥ UB then return ∅;
  select the vertex v of minimum degree from G;
  C1 ← iMaxClique(Gv, C∪{v}, LB);
  if |C1| > LB then LB ← |C1|;
  C2 ← iMaxClique(G\v, C, LB);
  if |C1| ≥ |C2| then return C1; else return C2;
end
2 A Branch and Bound Maxclique Solver
A clique C of a graph G=(V, E) is maximal if no other vertex of G can be added
to C to form a larger clique. C is maximum if no clique of larger cardinality exists
in G. G can have several maximum cliques. Let V′ be a subset of V; the subgraph G′
of G induced by V′ is defined as G′=(V′, E′), where E′={(vi, vj) ∈ E | vi, vj ∈ V′}.
Given a vertex v of G, the set of neighbor vertices of v is denoted by N(v)={v′ | (v, v′) ∈
E}. The cardinality |N(v)| of N(v) is called the degree of v. Gv denotes the subgraph
induced by N(v), and G\v denotes the subgraph induced by V\{v}; G\v is obtained
by removing v and all edges connecting v from G.
2.1 A Basic branch and bound Maxclique algorithm
The space of all possible cliques of a graph G can be represented by a binary search
tree. A basic branch and bound algorithm for Maxclique compares all these cliques
and outputs a maximum clique, by exploring the search tree in a depth-first manner.
Algorithm 1 shows the pseudo-code of this algorithm, inspired by the branch and
bound algorithm MaxSatz for the MaxSAT problem proposed in [7].
At every node, the algorithm works with a graph G=(V , E), a clique C of the initial
input graph under construction, and the cardinality of the largest clique found so far
(called lower bound LB). At the root, G is the input graph, C is the empty set ∅, and
LB is 0. At the other nodes, G is a subgraph of the initial input graph, C is disjoint from
G but every vertex of G is connected to every vertex of C in the initial input graph. In
other words, adding any clique of G into C forms a clique of the initial input graph.
The aim of the algorithm is to add a clique of maximum cardinality of G into C.
If G is empty, then a maximal clique C of the initial input graph is found and
returned. Otherwise the algorithm compares LB with the sum of the cardinality of C
and an overestimation of the maximum cardinality of the cliques in G. This sum is called
the upper bound (UB).
If LB < UB, the algorithm tries to find a clique of larger cardinality than LB. For
this purpose, it selects a vertex v in G and implicitly partitions all cliques in G into
two sets, each of which corresponds to a branch of the current tree node: all cliques
containing v and all cliques not containing v. The algorithm is then recursively called
for each branch to obtain the largest clique in each set. The cardinality of these two
cliques is compared and the larger one is returned.
If LB ≥ UB, a larger clique clearly cannot be found from G and C. The algorithm
should then prune the subtree rooted at the current node and backtrack to a higher level
in the search tree.
The estimation of a tight upper bound (UB) of the maximum clique cardinality is
essential for the performance of iMaxClique, since a tight upper bound allows
iMaxClique to prune a subtree even for a large G. Algorithm iMaxClique uses a
new upper bound presented in the next subsection.
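To make the recursion of Algorithm 1 concrete, the following minimal Python sketch mirrors its structure. The adjacency-set graph representation and the trivial placeholder bound are our own illustrative assumptions, not the authors' implementation; the placeholder overestimation should be replaced by Algorithm 2.

# Minimal sketch of Algorithm 1 (iMaxClique) on a graph given as a dict
# mapping each vertex to the set of its neighbours.  The overestimation()
# helper (Section 2.2) is assumed to exist; here a trivial bound |V| keeps
# the sketch self-contained.

def overestimation(graph):
    return len(graph)                          # placeholder: replace by Algorithm 2

def subgraph(graph, vertices):
    """Subgraph induced by `vertices`."""
    return {v: graph[v] & vertices for v in vertices}

def imaxclique(graph, clique=frozenset(), lb=0):
    if not graph:                              # |V| = 0: C is maximal
        return clique
    ub = overestimation(graph) + len(clique)   # upper bound
    if lb >= ub:                               # prune the subtree
        return frozenset()
    v = min(graph, key=lambda u: len(graph[u]))                       # min-degree vertex
    c1 = imaxclique(subgraph(graph, graph[v]), clique | {v}, lb)      # cliques with v
    lb = max(lb, len(c1))
    c2 = imaxclique(subgraph(graph, graph.keys() - {v}), clique, lb)  # cliques without v
    return c1 if len(c1) >= len(c2) else c2

# Example: a triangle plus a pendant vertex; the maximum clique is {0, 1, 2}.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(imaxclique(g))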
2.2 A new upper bound for MaxClique
The upper bound used in iMaxClique is computed as overestimation(G)+|C|, where
the overestimation(G) function is based on the following property:
Proposition 1. Let ω(G) denote the cardinality of the maximum clique of the graph G.
If G can be partitioned into k independent sets, then ω(G) ≤ k.
Proof. Let C be a maximum clique of G and S1 , S2 , ..., Sk be the k independent
sets partitioning G. At most one vertex in an independent set Si can belong to C (i.e.
|Si ∩C|≤1 ∀i), because all vertices of C are connected but vertices in Si are not connected. So the cardinality of C is at most k.
Algorithm 2 shows the pseudo-code of overestimation(G), which works with the
complementary graph Ḡ of G and the iClique(Ḡ, ∅) function to find a maximal independent set of G. Note that a clique in Ḡ is an independent set in G. The function
overestimation(G) repeatedly calls the iClique function to partition G into independent
sets and returns the number of independent sets.
Algorithm 2 overestimation(G), an overestimation of the maximum cardinality of a
clique of G
Input: A graph G=(V , E)
Output: the number of independent sets in a partition of G into independent sets
begin
overestimation ← 0;
while Ḡ is not empty do
C ← iClique(Ḡ, ∅);
remove every vertex of C and its edges from Ḡ;
overestimation ← overestimation +1;
return overestimation;
end
The iClique function is shown in Algorithm 3. It returns a maximal clique of
G, since every time a vertex v is added into C, all vertices not connected to v are
eliminated from G. Notice that iClique is similar to iMaxClique, except that it does
not backtrack, since there is only one recursive call.
Algorithm 3 iClique(G, C), a simple algorithm to find a maximal clique
Input: A graph G=(V , E), a clique C
Output: a maximal clique of G
begin
if |V |=0 then return C;
select the vertex v of minimum degree from G;
return iClique(Gv , C∪{v});
end
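The following small Python sketch illustrates Algorithms 2 and 3 together; the complement-graph construction and the adjacency-set representation are illustrative assumptions on our part, not the authors' code.

# Sketch of Algorithms 2 and 3: partition G into maximal independent sets by
# repeatedly extracting a maximal clique of the complement graph Ḡ.

def complement(graph):
    """Complement Ḡ of a graph given as {vertex: set of neighbours}."""
    vs = set(graph)
    return {v: vs - graph[v] - {v} for v in graph}

def iclique(graph):
    """Algorithm 3: greedily grow a maximal clique (no backtracking)."""
    clique = set()
    while graph:
        v = min(graph, key=lambda u: len(graph[u]))     # min-degree vertex
        clique.add(v)
        keep = graph[v]                                  # keep only neighbours of v
        graph = {u: graph[u] & keep for u in keep}
    return clique

def overestimation(graph):
    """Algorithm 2: number of maximal independent sets covering G,
    an upper bound on the maximum clique cardinality (Proposition 1)."""
    gbar = complement(graph)
    count = 0
    while gbar:
        indep = iclique(gbar)            # clique of Ḡ = independent set of G
        gbar = {u: gbar[u] - indep for u in gbar if u not in indep}
        count += 1
    return count

g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(overestimation(g))   # any clique of g has at most this many vertices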
3 Comparative Evaluation of iMaxClique
Regin proposed in [9] a branch-and-bound algorithm for Maxclique using an upper
bound based on a matching algorithm. This upper bound roughly corresponds to the
number of independent sets of cardinality 2 in G. The upper bound is fast to compute
and can be used to eliminate some vertices in G after each branching as follows: select a
vertex v and compute the upper bound for the subgraph induced by all neighbors of v; if
the upper bound is smaller than or equal to LB, then v is eliminated from G. This so-called
filtering process is applied to a subset of vertices and is intensively used in Regin's
algorithm. Regin's algorithm also uses the diving technique in conjunction with a MIP
approach, and a new search strategy to select the next vertex v expanding the current
clique, applying the idea of the not set proposed in [2].
Fahle [4] proposed a branch-and-bound algorithm for Maxclique using an upper
bound based on an estimation of the number of colors needed to color G. The drawback
of this upper bound compared with Regin's approach is the time needed to compute
such a bound [9]. The filtering process is also intensively used to eliminate some vertices
after branching. Note that the set of vertices having the same color is necessarily
independent, but not necessarily maximal, which is to be compared with the sequence
of maximal independent sets computed by overestimation(G) shown in Algorithm 2.
In Table 1 we compare iMaxClique with Regin's algorithm and Fahle's algorithm
on the famous DIMACS benchmark for Maxclique, excluding the 14 instances
remaining open in [9] and the very easy instances (those that are solved by all
three algorithms in less than 1 second); the runtimes for Regin's algorithm and
Fahle's algorithm are taken from [9]. The runtime of iMaxClique (denoted iMaxC
in the table) is obtained on a computer similar to the one used in [9], i.e. a Pentium IV
at 2.0 GHz with 512 MB of memory under Linux, and includes the reading of the graphs
from the input files, which may not be negligible for large graphs. We also show the
improvement of iMaxClique over Regin's algorithm (gap), computed as (Regin's runtime −
iMaxClique's runtime)/iMaxClique's runtime; for brock200_1, for instance,
gap = (10.72 − 6.61)/6.61 ≈ 0.62.
Except for six relatively easy instances (hamming10-2, johnson16-2-4, p-hat500-1,
p-hat1500-1, san200_0.9_2, and sanr200_0.9, each solved within a few minutes),
iMaxClique is significantly (up to 17 times) faster than Regin's and Fahle's
algorithms for all the remaining instances in Table 1. It appears that the gain of
iMaxClique is more important for larger and denser graphs. For example, iMaxClique
is 66% faster than Regin's algorithm for the brock200 instances and is 118% faster for
the brock400 instances on average.
Table 1. Runtime in seconds on DIMACS Maxclique benchmarks
Name            n     m       density  ω    Fahle      Regin      iMaxC     gap
brock200_1      200   14834   0.7454   21   92.72      10.72      6.61      0.62
brock200_3      200   12048   0.6054   15   2.23       0.86       0.42      1.04
brock200_4      200   13089   0.6577   17   8.18       2.13       1.27      0.67
brock400_1      400   59723   0.7484   27   fail       11,340.8   5,448.65  1.08
brock400_2      400   59786   0.7492   29   fail       7,910.6    2,423.43  2.26
brock400_3      400   59681   0.7479   31   fail       4,477.23   3,525.16  0.27
brock400_4      400   59765   0.7489   33   fail       6,051.77   2,244.49  1.69
hamming10-2     1024  518656  0.9902   512  5.16       1.04       49.71     *
hamming8-4      256   20864   0.6392   16   6.11       4.19       0.65      5.44
johnson16-2-4   120   5460    0.7647   8    7.91       3.80       5.49      *
keller4         171   9435    0.6491   11   2.53       0.5        0.21      1.38
p-hat300-2      300   21928   0.4889   25   3.01       0.59       0.46      0.28
p-hat300-3      300   33390   0.7445   36   856.67     40.71      31.09     0.30
p-hat500-1      500   31569   0.2531   9    0.6        2.30       0.71      2.23
p-hat500-2      500   62946   0.5046   36   203.93     32.69      18.38     0.77
p-hat500-3      500   93800   0.7519   50   fail       12,744.7   2,085.79  5.11
p-hat700-1      700   60999   0.2493   11   2.67       6.01       2.35      1.55
p-hat700-2      700   121728  0.4976   44   2,086.63   255.79     143.61    0.78
p-hat1000-1     1000  122253  0.2448   10   16.43      27.80      15.71     0.76
p-hat1000-2     1000  244799  0.4901   46   fail       16,845.7   7,343.64  1.29
p-hat1500-1     1500  284923  0.2534   12   119.77     480.84     237.05    1.02
MANN_a27        378   70551   0.9901   126  10,348.87  18.48      17.12     0.08
san1000         1000  250500  0.5015   15   3044.09    102.80     24.24     3.24
san200_0.7_1    200   13930   0.7000   30   1.57       0.36       0.06      5.00
san200_0.9_1    200   17910   0.9000   70   62.61      1.04       0.28      2.71
san200_0.9_2    200   17910   0.9000   60   1930.90    2.62       3.35      *
san200_0.9_3    200   17910   0.9000   44   194.96     182.70     11.83     14.44
san400_0.5_1    400   39900   0.5000   13   6.74       1.19       0.31      2.83
san400_0.7_1    400   55860   0.7000   40   425.99     23.28      3.48      5.68
san400_0.7_2    400   55860   0.7000   30   159.72     67.53      3.68      17.35
san400_0.7_3    400   55860   0.7000   22   617.07     273.23     41.62     5.56
san400_0.9_1    400   71820   0.9000   100  7,219.53   1,700      210.62    7.07
sanr200_0.7     200   13868   0.6969   18   24.99      4.30       2.65      0.62
sanr200_0.9     200   17863   0.8976   42   fail       150.08     164.22    *
sanr400_0.5     400   39984   0.5011   13   23.09      17.12      13.92     0.22
sanr400_0.7     400   55869   0.7001   21   15,925     3,139.11   1,813.68  0.73
For graphs with the same number of vertices, it appears that the gain of iMaxClique
grows quickly when the density of the graphs is high. For example, iMaxClique is 77%
faster than Regin's algorithm for p-hat1000-1 (density=0.2448), and is 129% faster for
p-hat1000-2 (density=0.4901). Furthermore, iMaxClique is 77% faster than Regin's
algorithm for p-hat500-2 (density=0.5046), and is more than five times faster for
p-hat500-3 (density=0.7519).
When a graph is large and dense, estimating the number of disjoint maximal independent
sets appears to give a much tighter upper bound on the maximum cardinality of a clique
than computing a maximal matching as in Regin's algorithm or computing an upper bound
on the number of colors needed to color the graph as in Fahle's algorithm. This explains
the performance of iMaxClique compared with Regin's and Fahle's algorithms for large
and dense graphs.
Thanks to the new upper bound, iMaxClique solves, for the first time to our knowledge,
6 instances (out of 14) remaining open in [9]. Table 2 summarizes the performance
of iMaxClique on these huge and dense instances on a Mac Pro machine with a 2.8 GHz
Intel Xeon processor and 4 GB of memory under Mac OS X. The 8 other open
instances could not be solved after one day (24 hours = 86,400 seconds) of computation.
Table 2. Performance of iM axClique in seconds on hard DIMACS Maxclique benchmarks
Name        n     m       density  ω    runtime
brock800_1  800   207505  0.6493   23   76,242
brock800_2  800   208166  0.6513   24   70,296
brock800_3  800   207333  0.6487   25   50,151
brock800_4  800   207643  0.6497   26   40,013
MANN_a45    1035  533115  0.9963   345  10,921
p-hat700-3  700   183010  0.7480   62   27,377
4 Conclusion
We have presented a simple and efficient branch and bound algorithm for Maxclique. The
algorithm is called iMaxClique because it uses a new powerful upper bound based on
a partition of a graph into independent sets. The performance of iMaxClique is evaluated
on the DIMACS Maxclique benchmarks and compared with the best state-of-the-art exact
algorithms to our knowledge. The results are very encouraging: iMaxClique
is significantly faster and has been able to close six instances remaining open in [9].
References
1. R. Battiti and M. Protasi. Reactive local search for the maximum clique problem. Algorithmica, vol. 29, No. 4, pp. 610-637, 2001.
2. C. Bron, J.A.G.M. Kerbosch, and H. J. Schell. Finding cliques in an undirected graph. Tech.
Rep. Technological U. of Eindhoven, The Netherlands.
3. S. Busygin. A New Trust Region Technique for the Maximum Clique Problem. Report
submitted, available at http://www.busygin.dp.ua, 2002.
4. T. Fahle. Simple and fast: Improving a branch-and-bound algorithm for maximum clique. In
Proceedings of ESA-2002, pp. 485-498, 2002.
5. A. Grosso, M. Locatelli, F. Della Croce. Combining swaps and node weights in an adaptive
greedy approach for the maximum clique problem. Journal of Heuristics, Vol. 10, pp. 135-152, 2004.
6. K. Katayama, A. Hamamoto, H. Narihisa. Solving the maximum clique problem by k-opt
local search. Proc. of the 19th Annual ACM Symposium on Applied Computing (SAC-2004),
Nicosia, Cyprus, Vol. 2, pp. 1021-1025, March 14-17, 2004.
7. C. M. Li, F. Manyà, and J. Planes. New inference rules for Max-SAT. Journal of Artificial
Intelligence Research, 30:321-359, 2007.
8. W. Pullan, H. H. Hoos. Dynamic Local Search for the Maximum Clique Problem. Journal
of Artificial Intelligence Research, Vol. 25, pp. 159-185, 2006.
9. J.-C. Regin. Solving the maximum clique problem with constraint programming. In Proceedings of CPAIOR’03, Springer, LNCS 2883, pp. 634-648, 2003.
10. J. I. van Hemert and C. Solnon. A Study into Ant Colony Optimization, Evolutionary
Computation and Constraint Programming on Binary Constraint Satisfaction Problems. In
Proceedings of Evolutionary Computation in Combinatorial Optimization (EvoCOP 2004),
pp. 114-123, 2004.
Exploring Local Acyclicity within
Russian Doll Search
Margarita Razgon (PhD student) and
Gregory M. Provan (supervisor)
{m.razgon|g.provan}@cs.ucc.ie
Department of Computer Science, University College Cork, Ireland
Abstract. In this paper we present two algorithms that improve Russian Doll Search (rds) using local acyclicity without making any assumption about the treewidth of a considered wcsp instance. In each iteration
of computing a lower bound, the first algorithm uses one acyclic constraint subgraph as a special case of our method rds-mdac [12], while
the second algorithm explores a number of disjoint acyclic constraint
subgraphs. We empirically demonstrate that the proposed algorithms
outperform rds-mdac and rds over all the problem domains we studied.
Keywords: Weighted Constraint Satisfaction Problem, Russian Doll
Search, Branch-and-Bound, local inconsistency, acyclicity.
1 Introduction
The efficiency of Branch-and-Bound methods for Weighted Constraint Satisfaction
Problems strongly depends on the method of computing a lower bound associated with
the current node of the search tree. There are two main approaches to computing lower
bounds: (1) based on counting local inconsistencies of the given constraint network (cn)
[14, 5, 6, 8, 2, 7, 3]; and (2) based on Russian Doll Search (rds) [13, 9, 10]. The
advantage of the first approach is that it takes into account the current domains of
variables, but the drawback is that it cannot explore deep interactions between the
variables. The rds-based methods have the opposite properties. They do explore deep
interactions between variables, due to computing the optimal solution weights for
subproblems of the given wcsp. However, these weights are computed based on the
initial domains of variables, and hence they may be much smaller than the weights
based on the current domains.
It is natural to consider a hybrid technique that would benefit from the advantages of
both approaches. Such an algorithm, rds-mdac, has been introduced by us in [12]. In
particular, we partition the set of unassigned variables into two subsets. Then we
evaluate the lower bound of the projection to one of the subsets using the directed
arc-inconsistency counts [14, 5], while the projection to the other subset is evaluated
by rds [13].
In this paper we generalize this approach by presenting a generic framework combining
the local-inconsistency-based lower bound evaluation and the rds-based one. The
framework allows us to combine rds with the majority of known consistency propagation
algorithms [8, 2, 7, 3]. Then we specify the framework so that the set of unassigned
variables processed by a local-inconsistency-based method forms an acyclic sub-cn, so
that the optimal solution weight for the wcsp induced by these variables can be computed
in polynomial time. We call the resulting algorithm rds-tree. rds-tree explores local
acyclicity of the constraint graph of the given wcsp, without making any assumption
about its treewidth and density [4]. We show that rds-tree outperforms rds-mdac and rds
even on quite dense instances of wcsps.
Inspired by the above results, we ask ourselves if it is possible to explore other types
of local consistency where it is not necessary that the algorithm catches a large acyclic
sub-cn of the given wcsp. We answer this question positively. In particular, we design
an algorithm rds-cascade that can benefit from the existence of a large number of small
disjoint acyclic sub-cns (i.e. a cascade of such sub-cns). We show that rds-cascade even
outperforms rds-tree.
The rest of the paper is organized as follows. In Section 2 we give the necessary
background. In Section 3 we describe our algorithms rds-tree and rds-cascade. In
Section 4 we report our experiments.
2 Background
Let Z be a binary Constraint Network (cn) with the set {v1 , . . . , vn } of variables.
We denote the projection of Z to {vi , . . . , vn } by Zi . Let D(vi ) be a domain of
values of vi . An assignment of Z is a pair (vi , vali ) such that vali ∈ D(vi ).
The constraints of Z are represented by the set of conflicting pairs of values (or
conflicts) of the given cn. A set of assignments, at most one for each variable,
is a partial solution of Z. A partial solution that assigns all the variables of
Z is a solution of Z. In a Weighted Constraint Satisfaction Problem (wcsp), conflicts
are assigned importance weights, and the weight of a solution is the sum of the weights
of the violated conflicts. The task of a wcsp is to find the optimal solution
of Z (having the smallest weight). A wcsp can be represented by its constraint
graph, which contains one vertex per variable and one edge connecting any pair
of vertices whose variables have at least one conflict between their values.
When a wcsp is being solved by Branch-and-Bound (b&b), two numbers LB
and UB are maintained during solving, which are, respectively, a lower and an
upper bound on the optimal solution weight. If LB ≥ UB then b&b backtracks.
Russian Doll Search (rds) [13] is a procedure that performs n successive
searches on nested subproblems, i.e. rds solves a wcsp first for Zn , then for
Zn−1 , Zn−2 , . . . , and finally, for Z1 , for the fixed order v1 , . . . , vn of variables.
Assume that rds solves Z1 , and the current partial solution P assigns variables
v1 , . . . , vi−1 and has weight F . Let v be an unassigned variable and val ∈ D(v).
We denote by Conf(v, val, P) the sum of weights of conflicts of (v, val) with P.
Let MinConf(v, P) = min_{val∈D(v)} Conf(v, val, P). We denote by Connect the
sum of MinConf(v, P) over all unassigned variables. Let Y be the weight of an
optimal solution of Zi obtained before. Then LB(rds) = F + Connect + Y.
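To ground these definitions, here is a small Python sketch of Conf, MinConf, Connect and LB(rds); the dictionary-based encoding of conflict weights, domains and partial solutions is an illustrative assumption of ours, not the representation used in [12, 13].

# A binary wcsp is encoded by `weight`, a dict mapping a frozenset of two
# (variable, value) pairs to the weight of that conflict (absent pairs have
# weight 0); `domains` maps each variable to its current domain.

def conflict_weight(weight, a, b):
    return weight.get(frozenset((a, b)), 0)

def conf(weight, v, val, partial):
    """Conf(v, val, P): total weight of the conflicts of (v, val) with P."""
    return sum(conflict_weight(weight, (v, val), item) for item in partial.items())

def min_conf(weight, domains, v, partial):
    """MinConf(v, P) = min over val in D(v) of Conf(v, val, P)."""
    return min(conf(weight, v, val, partial) for val in domains[v])

def lb_rds(weight, domains, partial, unassigned, y):
    """LB(rds) = F + Connect + Y, with F the weight of the partial solution P
    and Y the precomputed optimal weight of the relevant rds subproblem."""
    items = list(partial.items())
    f = sum(conflict_weight(weight, items[a], items[b])
            for a in range(len(items)) for b in range(a + 1, len(items)))
    connect = sum(min_conf(weight, domains, v, partial) for v in unassigned)
    return f + connect + y

# Tiny usage example with hypothetical weights: x is assigned, y and z are not.
w = {frozenset({("x", 0), ("y", 0)}): 2, frozenset({("x", 0), ("y", 1)}): 1}
print(lb_rds(w, {"y": [0, 1], "z": [0]}, {"x": 0}, ["y", "z"], y=0))   # prints 1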
Note that Y takes into account only the initial domains of variables. To consider the
current domains as well, let Y be the optimal solution weight of Zk for some k > i
(i.e. for a subproblem smaller than Zi). For variables vi, ..., vk−1 we compute the sum
of directed arc-inconsistency (DAC) counts, i.e. we evaluate the lower bound according
to the mdac algorithm [14, 5]. In particular, for each (vj, val) such that i ≤ j ≤ k − 1
and val ∈ D(vj), we compute dac_{vj,val}, the number of variables vr (r > j) whose
current domains conflict with (vj, val). Then we set DM(vj, val, P) = Conf(vj, val, P) +
dac_{vj,val}, MinDM(vj, P) = min_{val∈D(vj)} DM(vj, val, P), and
SumDM = Σ_{j=i}^{k−1} MinDM(vj, P). Then
LB(rds-mdac) = F + Connect + SumDM + Y is a valid lower bound considered
by our algorithm rds-mdac [12].
3 Exploring local acyclicity within RDS
Consider again the last paragraph of Section 2, and let dac^r_{vj,val} be defined
analogously to dac_{vj,val}, with the only difference that only arc-inconsistent variables
vr with r ≥ k are counted. Let Z′ be the wcsp induced by the current domains of
variables vi, ..., vk−1 (i.e. the weights of conflicts between them are as in the original
wcsp). Let DM(v, val, P) = Conf(v, val, P) + dac^r_{v,val} be the initial weight of
(v, val) for each assignment in Z′. Denote by C an arbitrary LB on the optimal
solution weight of Z′ (e.g., it can be computed by [8, 2, 7, 3]). Then it can be
shown that LB = F + Connect + C + Y is a valid LB on the optimal weight
of an extension of P. This method is a generic technique for computing a lower
bound in the rds framework. One special case of this method is rds-mdac [12].
We now present another special case.
Let k be the largest possible number such that the variables vi, ..., vk−1 induce an
acyclic subgraph of the constraint graph of Z, i.e. the constraint graph of Z′ is acyclic.
Let C be the optimal weight of a solution of Z′, which can be computed efficiently.
We call the algorithm which computes LB(rds-tree) = F + Connect + C + Y in this way
rds-tree. If Z′ involves all the unassigned variables, we update UB ← min(UB, F + C)
and backtrack immediately. LB(rds-tree) is generally larger than LB(rds-mdac), because
rds-mdac provides only an approximation of the weight of Z′, while rds-tree gives us
the exact optimal weight of Z′. It is not hard to show that the complexity of one
iteration of rds-tree is O(ed²) (as for rds-mdac), where e is the number of constraints
between the unassigned variables and d is the maximal domain size. It is also worth
mentioning that, due to the fixed order of variables explored by rds-tree (as well as by
other rds-based algorithms), the range of variables being included in Z′ can be
identified at the preprocessing stage.
The kind of local acyclicity explored by rds-tree depends on the existence of a
sufficiently large induced acyclic subgraph of the constraint graph of the given wcsp.
Another interesting type of local acyclicity occurs when the constraint graph has many
disjoint acyclic subgraphs. This type of local acyclicity is explored in our algorithm,
rds-cascade, described below. At the preprocessing stage, rds-cascade partitions all
the variables into subsets T1, ..., Tt (i.e. a cascade of sub-cns), so that each subset
induces an acyclic subgraph. The variables are statically ordered in such a way that each
subset includes consecutive variables according to this order. Let dac^f_{v,val} be
defined analogously to dac_{v,val}, with the only difference being that for each v ∈ Tm,
only variables that belong to Tm+1, ..., Tt are considered. Let Zm be the wcsp induced
by the variables of Tm, where for each assignment (v, val) of v ∈ Zm,
DM(v, val, P) = Conf(v, val, P) + dac^f_{v,val} is considered as the initial weight of
this assignment. Let WTm be the weight of an optimal solution of Zm (which can be
computed efficiently). Then LB(rds-cascade) = F + Σ_{m=1}^{t} WTm is a valid lower
bound on the optimal weight of an extension of P. Note that the rds part does not
participate in computing the lower bound. However, rds-cascade applies rds as a
filtering procedure, i.e., when for a given value (v, val) we have to decide whether or
not to remove this value from the current domain of v, (v, val) is temporarily added to
P and then the lower bound is computed using rds.
The complexity of rds-cascade is O(ed²), where e is the number of constraints between
the unassigned variables and d is the maximal domain size. Indeed, given the cascade
T1, ..., Tt, let e = (e1 + e2 + · · · + et) + (e^{up}_1 + e^{up}_2 + · · · + e^{up}_{t−1}),
where em (m ∈ {1, ..., t}) is the number of constraints between variables of Tm only and
e^{up}_m (m ∈ {1, ..., t − 1}) is the number of constraints between variables of Tm and
the following Tm+1, ..., Tt. Computing the initial weights of the values in each Tm takes
O(e^{up}_m · d²), and computing the weight WTm of each Tm takes O(em · d²),
analogously to rds-tree. Hence the resulting complexity is
O((e1 + e2 + · · · + et) · d² + (e^{up}_1 + e^{up}_2 + · · · + e^{up}_{t−1}) · d²) = O(ed²).
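As an illustration of the preprocessing step that builds the cascade, the sketch below greedily splits the ordered variables into consecutive clusters whose induced constraint subgraphs are acyclic, using a union-find cycle test. This greedy strategy and the edge representation are our own assumptions; the paper does not specify how the cascade is constructed.

# `edges` is the set of constraint-graph edges, each a frozenset of two distinct
# variables; the scan closes the current cluster as soon as adding the next
# variable (in the static order) would create a cycle inside the cluster.

def acyclic_cascade(order, edges):
    clusters, cluster = [], []
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for v in order:
        new_edges = [e for e in edges if v in e and (e - {v}).issubset(cluster)]
        parent[v] = v
        ok = True
        for e in new_edges:                 # union v with its cluster neighbours
            (u,) = e - {v}
            ru, rv = find(u), find(v)
            if ru == rv:                    # cycle detected
                ok = False
                break
            parent[ru] = rv
        if ok:
            cluster.append(v)
        else:                               # close the cluster, start a new one
            clusters.append(cluster)
            cluster = [v]
            parent = {v: v}
    clusters.append(cluster)
    return clusters

order = ["a", "b", "c", "d"]
edges = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]}
print(acyclic_cascade(order, edges))   # -> [['a', 'b'], ['c', 'd']]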
4 Experiments
We now compare rds-tree and rds-cascade to rds [13] and rds-mdac [12].
For all the tested algorithms, we order the variables in decreasing order of their
degrees in the constraint graph. We use the static value ordering heuristic, first
choosing the value that the variable had in the optimal assignment found on
the previous subproblem, then choosing the first available value [13]. All the
algorithms are implemented in C++, and the experiments are performed on a
2.66GHz Intel Xeon CPU with 4.4GB RAM using the GNU g++ compiler. We
measure the search efforts in terms of the number of backtracks and the CPU-time. We run these algorithms on two benchmarks: Binary Random Max-CSPs
and Radio Link Frequency Assignment Problems (RLFAP).
For Binary Random Max-CSPs, we generate cns given four parameters [11]:
(1) the number of variables n, (2) the domain size dom, (3) density p1 (the
probability of a constraint between two given variables), (4) tightness p2 (the
probability of a conflict between two values of the given pair of variables). We
fix three parameters (n, dom, and p1 ∈ {0.9, 0.5, 0.1}), and vary p2 over the range
[0.65, 1]. For every tuple of parameters, we impose a time limit of 200 seconds,
and report results as the average of 50 instances. Due to lack of space, we describe
empirical results, omitting their graphic presentation.
rds-mdac outperforms rds in the number of backtracks for all the considered values
of p1 and over the whole range of p2. The rate of improvement in backtracks by
rds-mdac over rds increases as p1 decreases: 2.5 times for p1 = 0.9, 3 times for
p1 = 0.5, 6 times for p1 = 0.1. The improvement in CPU-time by rds-mdac compared
with rds is achieved for middle and low values of density p1 over the whole range of
p2: a reduction of up to 20% for p1 = 0.5 and up to 5 times for p1 = 0.1.
The comparison of the number of backtracks between rds-mdac, rds-tree,
and rds-cascade is as follows. rds-tree outperforms rds-mdac for middle
and low values of density p1 over the whole range of p2 , achieving a reduction
of up to 15% for p1 = 0.5 and up to 20 times for p1 = 0.1. rds-cascade
outperforms both rds-mdac and rds-tree for all the considered values of p1
over the whole range of p2 . rds-cascade outperforms rds-tree by 70% on
average for p1 = 0.9 and p1 = 0.5, and up to 12 times for p1 = 0.1.
The comparison of the CPU-time is as follows. For high values of density p1,
both rds-tree and rds-cascade improve on rds-mdac by 30% on average over
the whole range of p2 , but only rds-cascade reduces the CPU-time of rds by
15% for high values of p2 starting from p2 = 0.93. For middle and low values of
density p1 : rds-tree outperforms rds-mdac by up to 30% for p1 = 0.5 and 6
times on average for p1 = 0.1, and rds-cascade outperforms rds-tree by up
to 35% on average for p1 = 0.5 and up to 7 times for p1 = 0.1.
Thus, for Binary Random Max-CSPs, the “winner” is rds-cascade, followed
by rds-tree (the second best), rds-mdac (the third) and rds (the fourth).
We also compare the average LB for one iteration of the algorithms rds-mdac
and rds-cascade, without backtracking. We choose a quarter of the variables randomly to form a current partial solution P, assign the variables in P randomly, then compute LB by the corresponding formulas of rds-mdac and rds-cascade. The experiments empirically show that LB(rds-cascade) is higher
than LB(rds-mdac) by up to 50% for all the considered values of density (0.9,
0.5, 0.1), and over the whole range of tightness.
The task of RLFAP [1] is to assign frequencies (values) to each of the radio links
(variables) in the given communication network in such a way that the links may
operate together without noticeable interference (unary and binary constraints).
For benchmarking we use five CELAR 6 sub-instances (SUB0, SUB1, SUB2,
SUB3, SUB4) [1]. The empirical results are as follows. rds-mdac outperforms
rds significantly in search efforts for all sub-instances. The best result of rds-mdac
over rds is achieved for SUB4: an improvement of 84 times in backtracks and 28 times
in CPU-time. rds-tree outperforms rds-mdac in search efforts for some sub-instances
(SUB1, SUB3, SUB4), achieving a reduction of up to 8 times in backtracks and up to
6 times in CPU-time. rds-cascade outperforms rds-mdac in search efforts for all
sub-instances, reducing backtracks by up to 10 times and the CPU-time by up to 8 times.
The best performance in the
search efforts among all the algorithms is achieved by rds-cascade (for SUB0,
SUB2, SUB4) or by rds-tree (for SUB1, SUB3).
5 Conclusion
In this paper we present two new algorithms, rds-tree and rds-cascade, which use
local acyclicity to improve rds when computing the LB. rds-tree considers the "small"
subproblems with unassigned variables only, as rds-mdac [12] does, but replaces the
mdac evaluation of the other unassigned variables by the exact computation of the
optimal weight of the acyclic sub-cn on these variables. rds-cascade divides all
unassigned variables into clusters that have acyclic constraint subgraphs, computes the
optimal weight of each cluster in polynomial time, and sums up the obtained optimal
weights of the clusters. The empirical evaluation shows that rds-tree and rds-cascade
outperform rds-mdac in all search efforts, pointing towards the conclusion that the
exact computation of the optimal weight of even a part of the variables significantly
improves the LB evaluation.
References
1. B. Cabon, S. de Givry, L. Lobjois, T. Schiex, and J. P. Warners. Radio
Link Frequency Assignment. Constraints, 4(1):79–89, 1999.
2. M. C. Cooper and T. Schiex. Arc consistency for soft constraints. Artif. Intell.,
154(1-2):199–227, 2004.
3. S. de Givry, F. Heras, M. Zytnicki, and J. Larrosa. Existential Arc Consistency:
Getting closer to full arc consistency in weighted CSPs. In IJCAI, pages 84–89,
2005.
4. J. Larrosa and R. Dechter. Boosting Search with Variable Elimination in Constraint Optimization and Constraint Satisfaction Problems. Constraints, 8(3):303–
326, 2003.
5. J. Larrosa and P. Meseguer. Exploiting the Use of DAC in MAX-CSP. In CP,
pages 308–322, 1996.
6. J. Larrosa, P. Meseguer, and T. Schiex. Maintaining Reversible DAC for Max-CSP.
Artif. Intell., 107(1):149–163, 1999.
7. J. Larrosa and T. Schiex. In the quest of the best form of local consistency for
Weighted CSP. In IJCAI, pages 239–244, 2003.
8. J. Larrosa and T. Schiex. Solving weighted CSP by Maintaining Arc Consistency.
Artif. Intell., 159(1-2):1–26, 2004.
9. P. Meseguer and M. Sánchez. Specializing Russian Doll Search. In CP, pages
464–478, 2001.
10. P. Meseguer, M. Sánchez, and G. Verfaillie. Opportunistic Specialization in Russian
Doll Search. In CP, pages 264–279, 2002.
11. P. Prosser. An empirical study of phase transitions in Binary Constraint Satisfaction Problems. Artif. Intell., 81(1-2):81–109, 1996.
12. M. Razgon and G. M. Provan. Adding Flexibility to Russian Doll Search. In
ICTAI, Vol. 1, pages 163–171, 2008.
13. G. Verfaillie, M. Lemaître, and T. Schiex. Russian Doll Search for Solving Constraint Optimization Problems. In AAAI/IAAI, Vol. 1, pages 181–187, 1996.
14. R. J. Wallace. Directed Arc Consistency Preprocessing. In Constraint Processing,
Selected Papers, pages 121–137, 1995.
Optimal Solutions for Conversational
Recommender Systems Based on Comparative
Preferences and Linear Inequalities
Walid Trabelsi¹ (student), Nic Wilson¹ and Derek Bridge²
¹ Cork Constraint Computation Centre, ² Department of Computer Science
University College Cork, Ireland
w.trabelsi@4c.ucc.ie, n.wilson@4c.ucc.ie, d.bridge@cs.ucc.ie
1 Introduction
Recommenders are automated tools that deliver selective information matching
personal preferences. These tools have become increasingly important to help
users find what they need from the massive amount of media, data and services
currently flooding the Internet.
In (Bridge and Ricci, 07) [1] a conversational product recommender system
is described in which a user repeatedly reconfigures, revises and then submits a
query until she finds the product she wants (see Section 2). A key component of
such a system is the pruning of queries that are judged to be less interesting to the
user than other queries. The approach of [1] assumes that the user's preferences
are determined by a sum of weights, which leads to pruning based on deduction
over linear inequalities (see Section 3). We have developed a new approach for
modelling the user's preferences, based on comparative preference statements
(see Section 4), which can be used as an alternative method for pruning less
interesting queries. Such an approach also allows the possibility of reasoning
with a broad range of other input statements, such as conditional preferences
and tradeoffs.
2 Information Recommendation
In this section we describe the recommender system setup, based on [1].
A given use case is characterized by a set of products. Products are defined
through a set V of n variables representing the features possibly included in
a given product. Each variable Fi is defined on a discrete and finite domain
denoted by D(Fi). In the following we assume every variable has a domain
D(Fi) = {fi, f̄i}, where fi represents that the feature is present and f̄i represents
that the feature is absent. Thus a product is represented by an assignment to V. For
example, if V = {F1, F2}, then the assignment f1f̄2 refers to a product containing
the feature F1 but not F2. The set of assignments to V is written as D(V).
The user and the system engage in a dialogue. At each point in the dialogue
there is a current query. The user has to choose a new query, which will become
the next current query. The system will generate a set of queries for the user to
choose between. There are three stages to producing this set of queries:
1. The set of queries which are close, in a particular sense, to the current query
are generated; these are called the Candidates.
2. Queries which are unsatisfiable are eliminated; the remaining queries are
called the Satisfiables.
3. Queries which are judged to be dominated by (i.e., worse than) another
of the satisfiable queries are eliminated, leaving the Undominated ones. We
consider two approaches for this last stage; the first one is based on deduction
with linear inequalities (see Section 3); the second one uses a new approach
for reasoning with comparative preferences (Section 4).
The user then chooses some element of the Undominated to be the new Current
Query. This choice of one query over the other possible choices is used to induce
information about the user's preferences; this information is used in Stage 3 for
eliminating queries at future points in the dialogue.
2.1 Generating the Candidates
The candidate queries are chosen, based on the method in [1], to be the set of
queries which are obtained from the current query by one of three moves: Add
concerns adding one feature to the current query, Switch consists of replacing a
feature by another one, and Trade consists of replacing one feature by two other
features. For instance, if Add(Fi) is selected then the advisor is aware of the user's
desire to add the feature Fi not yet present. Choosing Switch(Fi, Fj) means the
user is trying to replace the feature Fi by the feature Fj previously absent. By a
Trade(Fi, Fk, Fl) action the user wants to replace the feature Fi by the features
Fk and Fl previously absent. A small sketch of this candidate generation follows.
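The minimal Python sketch below enumerates the Add, Switch and Trade candidates; representing queries as sets of feature names and enumerating pairs with itertools are our own illustrative choices, not the implementation of [1].

from itertools import combinations

def candidates(current, all_features):
    absent = set(all_features) - current
    cands = []
    for f in absent:                                   # Add(Fi)
        cands.append(current | {f})
    for f in current:                                  # Switch(Fi, Fj)
        for g in absent:
            cands.append((current - {f}) | {g})
    for f in current:                                  # Trade(Fi, Fk, Fl)
        for g, h in combinations(absent, 2):
            cands.append((current - {f}) | {g, h})
    return cands

print(candidates({"F1"}, ["F1", "F2", "F3"]))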
2.2 Checking Satisfiability
The satisfiable queries are those for which there exists a product which has all the
features present in the query (it may also have other features not present in the
query). With the list of products being stored explicitly in a database, this can
be done by checking through the products explicitly. For configurable products,
where the set of products is represented as a set of solutions of a Constraint
Satisfaction Problem, satisfiability of a query can be checked by determining if
the CSP has solutions containing all the features in the query (which can be
checked by checking satisfiability of an augmented CSP). There can still be a
large number of queries for the user to consider. It is desirable to prune queries
which we believe the user will consider worse (of less interest) than another
of the queries.
3 Eliminating Queries Using Deduction of Linear Inequalities
As described above, we need a method to prune dominated queries, i.e., queries
which we believe will be less preferred by the user than others. We consider two
different approaches for doing this. The first one, which is a slightly amended
version of the approach used in [1], is based on deduction of linear inequalities
(denoted by LP).
It is assumed that the user assigns, for each feature Fi, a certain amount of
utility wi representing the value of that feature being included in the final product.
Therefore every user has a profile represented by a weight vector denoted
by w = (w1, ..., wn). The approach assumes the system does not know the values
of the wi; it only reasons about them as unknown variables.
Each move performed by the user is considered as feedback and makes the
advisor induce some user preferences between the product features. These preferences will be defined as linear inequalities on the user’s weights. For instance,
the system might infer that one selected feature seems to be more interesting, for the user, than the features not yet present. For example, suppose that
V = {F1, F2, F3}, and the current query contains no features (q = ∅). The
user chooses the new query {F1} based on the move Add(F1), hence preferring this
move to Add(F2 ) and Add(F3 ), suggesting that feature 1 is preferred to feature 2,
and to feature 3. As a consequence the advisor induces the constraints w1 ≥ w2
and w1 ≥ w3 . The structure storing the set of these inequalities is called the
User Model, which in this case equals {w1 ≥ w2 , w1 ≥ w3 }. The User Model is
updated and enriched by new inequalities as soon as the user performs moves.
The advisor can now exploit the User Model to eliminate queries in Satisfiables,
leaving the set of optimal queries: all queries which are not dominated by another one.
For a query α, let W(α) be the sum of the weights of the features in α.
With this approach, α dominates β if the set of linear inequalities contained in
the User Model implies the constraint W(α) ≥ W(β). A query β in Satisfiables is
eliminated if there exists some query α in Satisfiables which strictly dominates
β, i.e., such that α dominates β but β does not dominate α.
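The dominance test can be phrased as a small linear program. The sketch below uses scipy to minimise W(α) − W(β) over the cone of non-negative weight vectors satisfying the User Model: dominance holds exactly when this minimum is not unbounded below. The scipy-based encoding, and the restriction to homogeneous wi ≥ wj inequalities, are our own assumptions rather than the implementation of [1].

from scipy.optimize import linprog

def dominates(alpha, beta, user_model, n_features):
    # objective: minimise  sum_{i in alpha} w_i - sum_{i in beta} w_i
    c = [0.0] * n_features
    for i in alpha:
        c[i] += 1.0
    for i in beta:
        c[i] -= 1.0
    # each (i, j) in user_model means w_i >= w_j, i.e. -w_i + w_j <= 0
    a_ub, b_ub = [], []
    for i, j in user_model:
        row = [0.0] * n_features
        row[i], row[j] = -1.0, 1.0
        a_ub.append(row)
        b_ub.append(0.0)
    res = linprog(c, A_ub=a_ub or None, b_ub=b_ub or None,
                  bounds=[(0, None)] * n_features)
    return res.status == 0 and res.fun >= -1e-9

# The user has preferred feature 0 to features 1 and 2: w0 >= w1, w0 >= w2.
model = [(0, 1), (0, 2)]
print(dominates({0}, {1}, model, 3))   # True: w0 >= w1 is implied
print(dominates({1}, {2}, model, 3))   # False: nothing relates w1 and w2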
4 Eliminating Queries Using Reasoning with Comparative Preferences
We describe here a different approach for defining dominance among queries,
based on an approach for reasoning with comparative preferences (denoted by
CPR).
4.1 Comparative preferences
Comparative preferences are concerned with the relative preference of alternatives, in particular, statements implying that one alternative is preferred to
another. Such statements can be relatively easy to reliably elicit: often it is easier to judge that one alternative is preferred to another than it is to allocate
particular grades of preference to the alternatives.
We describe a language of comparative preference statements, based on [2].
Though it is a rather simple language, it is more expressive than perhaps the
best known languages for reasoning with comparative preferences: CP-nets [3],
TCP-nets [4], and the formalism defined in [5].
A language of comparative preference statements. Let P, Q and T be subsets of
V. Let p be an assignment to P, and let q be an assignment to Q. The language
includes the statement p ≥ q||T. The meaning of this statement is that a complete
assignment α is preferred to β if α extends p, β extends q, and α and β agree
on the set of variables T.
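The semantics of a single statement p ≥ q||T translates directly into code; the following Python sketch checks it for two complete assignments (the actual deduction from a set Γ of statements uses the polynomial-time inference of [2], which is not reproduced here).

def extends(assignment, partial):
    return all(assignment.get(var) == val for var, val in partial.items())

def preferred_by(statement, alpha, beta):
    """Does the statement (p, q, T) make alpha preferred to beta?"""
    p, q, t = statement
    return (extends(alpha, p) and extends(beta, q)
            and all(alpha[var] == beta[var] for var in t))

# Example with V = {F1, F2}: "f1 is at least as good as not-f1, all else equal".
stmt = ({"F1": 1}, {"F1": 0}, {"F2"})
print(preferred_by(stmt, {"F1": 1, "F2": 0}, {"F1": 0, "F2": 0}))   # True
print(preferred_by(stmt, {"F1": 1, "F2": 1}, {"F1": 0, "F2": 0}))   # False: differ on F2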
Given a set of comparative preference statements Γ , and two complete assignments α and β, we need to determine if Γ implies that α is preferred to β. With
the standard kind of inference, this problem of determining the preference of an
outcome α over another outcome β has been shown to be PSPACE-complete
[6], even for the case of CP-nets. In our application, efficient deduction is very
important in order for the system to have a quick response time. The standard
kind of inference is also rather conservative. In [2], a less conservative kind of
inference is defined, which allows deduction in polynomial time; we use this
deductive method in approach here.
4.2 Expressing induced preference using comparative preference statements
In our setting it is always at least as desirable for a product to have a feature
as not. This can be represented by comparative preference statements of the
following forms:
fi ≥ f̄i || V − {Fi}.
This is saying that a query α which contains feature fi is at least as preferred
as a query β which is identical to α except for not containing fi.
Consider the situation where the user has chosen to add feature i instead
of feature j, which, with the linear inequalities approach, is represented by the
constraint wi ≥ wj . There are a number of different options for generating
preference statements for this situation.
1. Let q be the current query, and let q^i be the current query q with the feature
fi added. A rather conservative approach is to just model the preference of
feature i over feature j by the preference statement q^i ≥ q^j || ∅, which just
expresses a preference for q^i over q^j.
2. fi ≥ f̄i || V − {Fi, Fj} is saying that the presence or not of the feature Fi
is more important than the choice of Fj. Thus, whatever the state of the
feature Fj in the query, the user will prefer Fi to be present in the query so
that it is included in the final product.
3. fi f̄j ≥ f̄i fj || V − {Fi, Fj}; this is saying that the simultaneous presence of
the feature Fi and absence of the feature Fj is more interesting than the
simultaneous absence of the feature Fi and presence of the feature Fj (all
else being equal). For instance, if the user has to choose between two products
having the same features except for Fi and Fj, knowing the first contains Fi and
not Fj and the second contains Fj and not Fi, she will prefer the first one.
From this list we can easily appreciate the flexibility of comparative preferences when representing the user preferences.
4.3 Preliminary Experimentation and Results
The preliminary experimentation involved computing the pruning rate of the new
approach (CPR) and the one based on deduction of linear inequalities (LP). Tests
consist of engaging simulated users in a dialogue with the two types of advisors
(CPR and LP) and recording at each step of the dialogue the number of queries
pruned.
We performed several experiments: each one deals with a number (100) of
simulated users interacting with an advisor adopting either the LP or the CPR
approach to help select optimal queries, by pruning dominated queries. The
products used during the tests belong to three databases of hotels. We tested the
performance of the preference forms described above. These tests reveal that
the CPR approach prunes much more than the LP approach. In the following
table, we report the results (expressed as percentages) of three experiments
using the preference forms 1, 2 and 3 for the CPR approach. (Obviously the
LP approach's performance is not influenced by the preference form adopted;
the variation in the bottom row is due to random variation.)
       Form 1   Form 2   Form 3
CPR    87.62    91.23    89.09
LP     72.91    74.05    71.66
The above results show that the advice given by the CPR approach to the
user is more focused than the advice given by the LP approach. For example, when
applying the first preference form, 87.62% of the available queries are eliminated
by the CPR approach, compared with only 72.91% by the LP approach.
This allows the user to focus only on the most interesting queries according to
her preferences.
5 Discussion
Both the linear inequalities and the comparative preferences approaches have
been implemented, and we are currently in the process of comparing them experimentally, based on three databases of hotels as products. The advice should
contain confirmed feasible and non-dominated queries so that real users get satisfied
as soon as possible, since customers are very quickly annoyed and/or bored [7].
We have enhanced the reliability and efficiency of information recommendation by
providing an efficient comparative preference representation and inference engine, so
as to improve the elicitation of user preferences and perform efficient inference that
helps the user navigate to a suitable product.
A potential advantage of the comparative preferences approach is that it
allows also explicit preference statements to be made by the user which are
context-dependent.
We are planning in the future to address the related problem of finding optimal solutions of a Constraint Satisfaction Problem with respect to a set of comparative preference statements. This will involve incorporating our comparative
preferences deduction mechanism in a branch-and-bound search, and developing
specialist pruning mechanisms.
References
1. Bridge, D., Ricci, F.: Supporting product selection with query editing recommendations. In: Proceedings of ACM Recommender Systems 2007. (2007)
2. Wilson, N.: An efficient deduction mechanism for expressive comparative preference
languages. In: Proceedings of the Nineteenth International Joint Conference on
Artificial Intelligence (IJCAI-09). (2009)
3. Boutilier, C., Brafman, R.I., Domshlak, C., Hoos, H., Poole, D.: CP-nets: A tool
for reasoning with conditional ceteris paribus preference statements. Journal of
Artificial Intelligence Research 21 (2004) 135–191
4. Brafman, R., Domshlak, C., Shimony, E.: On graphical modeling of preference and
importance. Journal of Artificial Intelligence Research 25 (2006) 389–424
5. Wilson, N.: Extending CP-nets with stronger conditional preference statements. In:
Proceedings of AAAI-04. (2004) 735–741
6. Goldsmith, J., Lang, J., Truszczyński, M., Wilson, N.: The computational complexity of dominance and consistency in CP-nets. In: Proceedings of IJCAI-05. (2005)
144–149
7. Raskutti, B., Zukerman, I.: Generating queries and replies during information-seeking interactions. International Journal of Human Computer Studies 47 (1997)
689–734
A multithreaded solving algorithm for QCSP+
Jérémie Vautard (student)¹ and Arnaud Lallouet²
¹ Université d'Orléans — LIFO, BP6759, F-45067 Orléans
jeremie.vautard@univ-orleans.fr
² Université de Caen-Basse Normandie — GREYC, BP 5186, 14032 Caen
arnaud.lallouet@info.unicaen.fr
Abstract. This paper presents some ideas about multi-threading QCSP
solving procedures. We introduce a first draft of a multi-threaded algorithm for
solving QCSP+ and give some leads for future work on parallel solving of
quantified problems.
1 Introduction
Quantified constraint satisfaction problems (QCSP) have been studied for several
years, and many search procedures [7, 5], consistency definitions [6, 4] and
propagation algorithms [3, 1] have been proposed to solve them. However,
while several attempts have been made to build a parallel solver for CSPs, we
have not found any parallel approach for solving quantified problems. This is what
we try to do in this paper. First, we propose a sketch of a quite general framework
for solving QCSP+ problems (introduced in [2]) using a multi-threaded approach.
Then, we discuss several leads that can be explored in this domain.
2 The QCSP+ framework
2.1 Formalism
Variables, constraints and CSPs.
Let V be a set of variables. Each v ∈ V has a domain Dv. For a given
W ⊆ V, we denote by DW the set of tuples on W, i.e. the Cartesian product of the
domains of all the variables of W.
A constraint c is a pair (W, T), W being a set of variables and T ⊆ DW a set
of tuples. The constraint is satisfied for the values of the variables of W which
form a tuple of T. If T = ∅, the constraint is said to be empty and can never be
satisfied. On the other hand, a constraint such that T = DW is full and will be
satisfied whatever values its variables take. W and T are also denoted by var(c)
and sol(c).
A CSP is a set C of constraints. We denote by var(C) the set of its variables, i.e.
∪c∈C var(c), and by sol(C) the set of its solutions, i.e. the set of tuples on var(C)
satisfying all the constraints of C. The empty CSP (denoted ⊤) is true whereas
a CSP containing an empty constraint is false and denoted ⊥.
Quantified problems.
A quantified set of variables (or qset) is a pair (q, W) where q ∈ {∀, ∃} is
a quantifier and W a set of variables. We call a prefix a sequence of qsets
[(q0, W0), ..., (qn−1, Wn−1)] such that (i ≠ j) → (Wi ∩ Wj = ∅). We denote
var(P) = ∪_{i=0}^{n−1} Wi. A QCSP is a pair (P, G) where P is a prefix and G, also
called the goal, is a CSP such that var(G) ⊆ var(P).
A restricted quantified set of variables or rqset is a triple (q, W, C) where
(q, W) is a qset and C a CSP. A QCSP+ is a pair Q = (P, G) where P is a prefix
of rqsets such that ∀i, var(Ci) ⊆ ∪_{j=0}^{i} Wj. Moreover, var(G) ⊆ var(P) still
holds.
Solution.
A QCSP (P, G) where P = [(∃, W0), (∀, W1), ..., (∃, Wn)] represents the following logic formula: ∃W0 ∈ DW0 ∀W1 ∈ DW1 ... ∃Wn G
Thus, a solution of a quantified problem can no longer be a simple assignment of
all the variables: in fact, the goal has to be satisfied for all the values the
universally quantified variables may take. Intuitively, such a problem can be seen
as a "game" where an existential player tries to satisfy all the constraints of G
while a universal player aims at violating one of them, each player assigning the
variables in turn, in the order defined by the prefix. The solution must represent
the strategy that the existential player should adopt to be sure that, whatever its
opponent does, the goal will always be satisfied. This strategy can be represented
as a family of Skolem functions that give a value to an existential variable as a
function of the preceding universal ones, or by the set of every possible scenario
(i.e. total assignment of the variables) of the strategy. In this paper, we use this
latter representation and organize the set of scenarios in a tree: a root node
represents the whole problem, then, inductively:
– if the next qset (qi, Wi) is universal, the current node gets as many sons as
there are tuples in DWi. Each node is tagged with one of these tuples;
– if the next qset is existential, the current node gets one unique son, tagged
by an element of DWi.
Thus, each complete branch of this tree corresponds to a total assignment of
the variables of the problem. If every branch of such a tree corresponds to an
assignment satisfying G, then it is indeed a solution of the problem.
QCSP+ restricts the "moves" of each player to assignments that satisfy the
CSPs of the rqsets. The logic formula represented is:
∃W0 C0 ∧ (∀W1 C1 → (∃W2 ... G))
In this case, the notion of solution is the same, except that the restrictions
have to be taken into account: for a universal rqset (∀, Wi, Ci), the current node's
sons correspond to the solutions of Ci. For an existential rqset (∃, Wj, Cj), the
son must be tagged by an element such that the partial branch corresponds to
an assignment that satisfies Cj.
2.2 A basic solving procedure
One simple way to solve quantified problems is to adapt the classical backtracking algorithm for CSPs (a minimal sketch is given after the list):
– first, perform a propagation algorithm on the problem. If an inconsistency
is detected, return false.
– pick the leftmost quantified set of variables and enumerate the possible
values of its variables, dividing the problem into as many subproblems.
• in the universal case, solve all the subproblems. If one of them is false,
return false. Else, group all the corresponding substrategies and return
them.
• in the existential case, solve each subproblem until one of them does not
return false. If such a subproblem exists, create a node containing the
values that led to the corresponding subproblem, attach the substrategy
returned by the subproblem and return the whole. Otherwise,
return false.
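The following minimal Python sketch follows the recursion described above for a plain QCSP whose prefix quantifies one variable at a time; the propagation step is omitted and the goal is given as a predicate over complete assignments, both simplifying assumptions of ours.

# The strategy is returned as nested dicts keyed by (variable, value) pairs,
# or False when the (sub)problem has no winning strategy.

def solve(prefix, goal, assignment=None):
    assignment = dict(assignment or {})
    if not prefix:
        return {} if goal(assignment) else False
    (quant, var, domain), rest = prefix[0], prefix[1:]
    if quant == "forall":
        strategy = {}
        for val in domain:                 # every value must lead to a solution
            sub = solve(rest, goal, {**assignment, var: val})
            if sub is False:
                return False
            strategy[(var, val)] = sub
        return strategy
    else:                                  # "exists": one value suffices
        for val in domain:
            sub = solve(rest, goal, {**assignment, var: val})
            if sub is not False:
                return {(var, val): sub}
        return False

# exists x forall y : x != y is unsatisfiable, but forall y exists x : x != y
# has a winning strategy.
print(solve([("exists", "x", [0, 1]), ("forall", "y", [0, 1])],
            lambda a: a["x"] != a["y"]))                       # False
print(solve([("forall", "y", [0, 1]), ("exists", "x", [0, 1])],
            lambda a: a["x"] != a["y"]))                       # strategy tree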
3 Multithreaded solving: a first attempt
The multithreaded solving method we propose is based on a central data structure
called the manager, which manages several (single-threaded) solvers called workers: a
partial strategy is maintained, in which some nodes do not yet have a substrategy
attached. Each of these nodes corresponds to a task that a worker can solve. Formally,
a task consists of a pair (Q, τ) where Q is a QCSP+ and τ the partial assignment of
the variables of Q corresponding to the branch of the partial strategy where the
task is attached. Once a worker has finished solving a task, it returns its result
(either a sub-strategy or false), which will be taken into account by the manager
to update the partial strategy. Then, the worker receives another task to solve.
Whenever the queue of to-do tasks empties, the manager sends a signal to one
or several workers to stop their current task and send a partial result. Such a
result consists of a partial sub-strategy containing "unfinished" nodes, which
constitute as many new pending tasks.
Once the whole problem is solved (i.e. there is no to-do task left and no worker
is still running), each worker thread is killed, and the result can be
returned. Here is a description of the procedures and signals used in this framework:
Workers.
Each worker is a thread with a very simple main loop. This loop fetches a task
from the Manager and tries to solve it by calling an internal (single-threaded)
Solve method. The Manager can also answer with a WAIT pseudo-task,
which will cause the worker to sleep until a task becomes available (by calling
a special method of the Manager), or with a STOP pseudo-task, which will kill
the thread. Once the solver finishes, its result is sent to the Manager. This main
loop is described in Figure 1.
A worker is able to catch two signals called Terminate and Send partial. Both
indicate that the search procedure should stop, but the former means that the
task has become useless while the latter calls for returning a partial result, along
with a list of remaining "sub-tasks".
The Solve method can inherit from any original search procedure, but must
be modified in order to take the signals into account. Figure 2 shows an adaptation
of a basic QCSP+ solving procedure which can be interrupted by these signals.
Finally, a worker provides some methods so that other processes know which
subproblem it is working on.
Procedure Main
  loop
    task = Manager.fetchWork()
    if task == STOP then
      Exit
    else if task == WAIT then
      Manager.wait()
    else
      result = Solve(task)
      Manager.returnWork(result)
    end if
  end loop

Fig. 1. The worker main loop
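As an illustration only, the loop of figure 1 could be realised with native threads as follows. The Manager interface (fetchWork, returnWork, wait), the STOP and WAIT pseudo-tasks and the signal names follow the description above, while every other name is a hypothetical placeholder; this Python sketch is not the authors' implementation.

    import threading

    STOP, WAIT = object(), object()          # pseudo-tasks returned by the Manager

    class Worker(threading.Thread):
        """A worker repeatedly fetches a task from the Manager and solves it."""

        def __init__(self, manager, solve):
            super().__init__()
            self.manager = manager
            self.solve = solve                        # single-threaded search procedure
            self.terminate = threading.Event()        # the Terminate signal
            self.send_partial = threading.Event()     # the Send partial signal

        def run(self):
            while True:
                task = self.manager.fetch_work(self)
                if task is STOP:                      # the search is over: kill the thread
                    return
                if task is WAIT:                      # no pending task: sleep in the Manager
                    self.manager.wait(self)
                    continue
                self.terminate.clear()
                self.send_partial.clear()
                result = self.solve(task, self.terminate, self.send_partial)
                self.manager.return_work(self, task, result)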
Manager.
The manager is an object that contains and builds the winning strategy of the
problem. During search, this winning strategy is incomplete and some nodes are
replaced by tasks that remain to be solved. We call this list of remaining tasks ToDo.
A Manager is also aware of the list Current Workers of the workers currently
solving a task and maintains a list of sleeping workers that should be woken
when tasks become available for solving. Unless said otherwise, the Manager's
methods are called by a worker.
The fetchWork method withdraws a task from the ToDo list and returns
it. If ToDo is empty, it sends the signal Send partial to one worker from
Current Workers and returns WAIT. If Current Workers is also empty, the search
has ended and STOP is returned.
The returnWork method attaches the returned sub-strategy and cuts the
branches that are no longer necessary (for example, all the brothers of a complete
substrategy of an existential node, or directly the father node of a universal
substrategy as soon as one of its subproblems has been found inconsistent). Each
worker that was solving a node on a cut branch is sent the Terminate signal,
and the workers contained in the sleeping list are awoken.
Procedure Solve_u([(∀, W, C)|P′], G)
  STR := ∅
  SC := set of solutions of C
  while SC ≠ ∅ do
    choose t ∈ SC; SC := SC − t
    if Signal Terminate then
      return STOP
    end if
    CURSTR := Solve((P′, G)[W ← t])
    if CURSTR = Fail then
      return Fail
    else
      STR := STR ∪ CURSTR
    end if
    if Signal Send partial then
      return PartialResult(STR, SC)
    end if
  end while
  return STR

Procedure Solve_e([(∃, W, C)|P′], G)
  SC := set of solutions of C
  while SC ≠ ∅ do
    choose t ∈ SC; SC := SC − t
    if Signal Terminate then
      return STOP
    end if
    CURSTR := Solve((P′, G)[W ← t])
    if CURSTR ≠ Fail then
      if Signal Send partial then
        return PartialResult(tree(t, CURSTR), SC)
      else
        return tree(t, CURSTR)
      end if
    end if
  end while
  return Fail

Fig. 2. Search procedure (universal case on top, existential case below).
Finally, the wait method records the calling thread in the Sleeping list and puts it to
sleep until it is woken by the previous method.
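A possible shape of these Manager methods, continuing the Python sketch given after figure 1 (and therefore reusing its STOP and WAIT markers and the workers' Event attributes), could be the following. It is again only a hypothetical sketch of the described behaviour, not the actual implementation; the strategy-update and branch-cutting logic of returnWork is left abstract.

    import threading
    from collections import deque

    class Manager:
        """Holds the partial strategy, the ToDo tasks and the worker bookkeeping."""

        def __init__(self, initial_task):
            self.lock = threading.Lock()
            self.todo = deque([initial_task])     # tasks waiting for a worker
            self.current_workers = {}             # worker -> task currently being solved
            self.sleeping = []                    # workers waiting for a task

        def fetch_work(self, worker):
            with self.lock:
                if self.todo:
                    task = self.todo.popleft()
                    self.current_workers[worker] = task
                    return task
                if self.current_workers:          # ask a busy worker for a partial result
                    next(iter(self.current_workers)).send_partial.set()
                    return WAIT
                return STOP                       # no task and no running worker: search over

        def return_work(self, worker, task, result):
            with self.lock:
                self.current_workers.pop(worker, None)
                # Attach 'result' to the partial strategy, cut the useless branches,
                # send Terminate to the workers on cut branches and push the pending
                # sub-tasks of a partial result onto self.todo (details omitted here).
                for w in self.sleeping:           # wake the sleeping workers
                    w.wake.set()
                self.sleeping.clear()

        def wait(self, worker):
            with self.lock:
                worker.wake = threading.Event()
                self.sleeping.append(worker)
            worker.wake.wait()                    # sleep until return_work wakes us up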
4 Work leads
Search heuristics.
The time taken by a search procedure to solve a CSP greatly depends on the
heuristics used to choose which subproblem should be explored first. Unfortunately, most parallel approaches tend to be incompatible with these heuristics,
which degrades solving performance. QCSPs (and QCSP+s) reduce the alternatives,
as a solver has to follow the order of the prefix, and the impact of the heuristics
used to perform the remaining choices is unclear, because they were not originally
tailored for QCSPs. In 2008, Verger and Bessière presented in [8] a promising
heuristic for QCSP+ that accelerates solving by several orders of magnitude on
some problems. Parallelizing the search might, as for CSPs, diminish the benefit
of these heuristics.
Task priority.
In the presented parallel approach, each task could be given a priority, in order
to minimize the solving of pointless subproblems. The method used to compute this
priority will most likely have a strong impact on the solving time: indeed, whether solving a given subproblem
is pointless may depend on the result of another task, and the
priority given to the tasks should take that into account. For example, it sounds
reasonable to give top priority to the leftmost universal nodes, as every branch from
a universal node must be verified anyway, a single failure cutting the whole
subproblem. After that, solving the rightmost pending existential tasks should help
to complete the construction of sub-strategies. A possible priority key along these lines is sketched below.
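The following Python key is a hypothetical illustration of such a policy, not a recommendation derived from experiments; the task attributes it uses (the quantifier of the node the task hangs from and its depth in the prefix, 0 being leftmost) are placeholders.

    def priority(task):
        """Smaller tuple = served earlier (illustrative key only)."""
        if task.quantifier == "forall":
            return (0, task.depth)        # leftmost universal tasks first
        return (1, -task.depth)           # then rightmost existential tasks

    # The Manager could simply keep its ToDo list sorted by this key:
    #   todo.sort(key=priority)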
Several kinds of workers.
The workers run independently of each other, and their communication
with the Manager is not specific to a particular search procedure. Thus, using
the algorithm of figure 2 in the workers is not mandatory: any procedure able to
return the sub-strategy of a problem is a priori appropriate. However, algorithms
that cannot return partial work and generate remaining tasks should not be
used, as they tend to prematurely drain the ToDo list, forcing other workers
into sleep mode. The sketch below illustrates what returning partial work can look like.
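As a sketch of what “returning partial work” can mean in practice, here is how the existential case of figure 2 might look in the same Python style, with the two Event signals passed in by the worker; Tree, PartialResult, FAIL, STOP and solve_rest are placeholder names introduced for this example only.

    from collections import namedtuple

    FAIL, STOP = object(), object()
    Tree = namedtuple("Tree", "value substrategy")
    PartialResult = namedtuple("PartialResult", "strategy pending_values")

    def solve_exists(values, solve_rest, terminate, send_partial):
        """Existential node that can stop early and hand back its remaining work.

        values      : solutions of the restricting constraint still to try;
        solve_rest  : function solving the rest of the problem under one value;
        terminate / send_partial : threading.Event signals set by the Manager.
        """
        while values:
            v = values.pop(0)
            if terminate.is_set():                    # the task has become useless
                return STOP
            sub = solve_rest(v)
            if sub is not FAIL:
                if send_partial.is_set():             # interrupted: the untried values
                    return PartialResult(Tree(v, sub), values)   # become pending tasks
                return Tree(v, sub)                   # complete winning substrategy
        return FAIL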
5 Conclusion
Solving quantified constraint satisfaction problems in parallel is new, and
even this contribution is far from complete. Thus, while the approach looks promising,
it still needs to prove its worth. We presented here a fairly simple and general
framework that has to be implemented and tested against traditional solvers
before definitive conclusions can be drawn.
References
1. Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard. Reusing CSP propagators
for QCSPs. In Francisco Azevedo, Pedro Barahona, François Fages, and Francesca
Rossi, editors, CSCLP, volume 4651 of Lecture Notes in Computer Science, pages
63–77. Springer, 2006.
2. Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard. QCSP made practical
by virtue of restricted quantification. In Manuela M. Veloso, editor, IJCAI, pages
38–43, 2007.
3. Lucas Bordeaux, Marco Cadoli, and Toni Mancini. CSP properties for quantified
constraints: definitions and complexity. In Manuela M. Veloso and Subbarao Kambhampati, editors, AAAI, pages 360–365. AAAI Press / The MIT Press, 2005.
4. Lucas Bordeaux and Eric Monfroy. Beyond NP: arc-consistency for quantified constraints. In Pascal Van Hentenryck, editor, CP, volume 2470 of Lecture Notes in
Computer Science, pages 371–386. Springer, 2002.
5. Ian P. Gent, Peter Nightingale, and Kostas Stergiou. QCSP-Solve: a solver for quantified constraint satisfaction problems. In Leslie Pack Kaelbling and Alessandro
Saffiotti, editors, IJCAI, pages 138–143. Professional Book Center, 2005.
6. Peter Nightingale. Consistency for quantified constraint satisfaction problems. In
Peter van Beek, editor, CP, volume 3709 of Lecture Notes in Computer Science,
pages 792–796. Springer, 2005.
7. Guillaume Verger and Christian Bessiere. BlockSolve: a bottom-up approach for
solving quantified CSPs. In Proceedings of CP'06, pages 635–649, Nantes, France,
2006.
8. Guillaume Verger and Christian Bessiere. Guiding search in QCSP+ with back-propagation. In Proceedings of CP'08, pages 175–189, Sydney, Australia, 2008.