
DAA-Unit-5 Notes


DAA NOTES

BY-

Dr. JYOTI SRIVASTAVA

UNIT-IV

BACKTRACKING AND BRANCH AND BOUND

General Method:
Backtracking is used to solve problems in which a sequence of objects is chosen from a
specified set so that the sequence satisfies some criterion. The desired solution is
expressed as an n-tuple (x1, . . . , xn), where each xi Є S, S being a finite set.
The solution is based on finding one or more vectors that maximize, minimize, or satisfy a
criterion function P(x1, . . . , xn). We form a solution one component at a time and check at
every step whether the partial solution has any chance of success; if at any point it seems
not promising, we abandon it. All solutions require a set of constraints divided into two
categories: explicit and implicit constraints.
Explicit constraints are rules that restrict each xi to take on values only from a given set.
Explicit constraints depend on the particular instance I of the problem being solved. All
tuples that satisfy the explicit constraints define a possible solution space for I.
Implicit constraints are rules that determine which of the tuples in the solution space of I
satisfy the criterion function. Thus, implicit constraints describe the way in which the
xi's must relate to each other.
For the 8-queens problem:

The explicit constraints, using an 8-tuple formulation, are S = {1, 2, 3, 4, 5, 6, 7, 8}.
The implicit constraints for this problem are that no two xi's can be the same (i.e., all
queens must be on different columns) and no two queens can be on the same diagonal.

Backtracking is a modified depth first search of a tree. Backtracking algorithms determine
problem solutions by systematically searching the solution space for the given problem
instance. This search is facilitated by using a tree organization for the solution space.
Backtracking is the procedure whereby, after determining that a node can lead to nothing
but dead ends, we go back (backtrack) to the node's parent and proceed with the search on
the next child.
A backtracking algorithm need not actually create a tree. Rather, it only needs to keep
track of the values in the current branch being investigated. This is the way we implement
backtracking algorithms. We say that the state space tree exists implicitly in the algorithm
because it is not actually constructed.
Terminology:
A problem state is each node in the depth first search tree.
Solution states are the problem states S for which the path from the root node to S
defines a tuple in the solution space.
Answer states are those solution states for which the path from the root node to S defines
a tuple that is a member of the set of solutions.
State space is the set of paths from root node to other nodes. State space tree is the tree
organization of the solution space. The state space trees are called static trees. This
terminology follows from the observation that the tree organizations are independent of
the problem instance being solved. For some problems it is advantageous to use different
tree organizations for different problem instance. In this case the tree organization is
determined dynamically as the solution space is being searched. Tree organizations that
are problem instance dependent are called dynamic trees.
Live node is a node that has been generated but whose children have not yet been
generated.
E-node is a live node whose children are currently being explored. In other words, an E-
node is a node currently being expanded.
Dead node is a generated node that is not to be expanded or explored any further. All
children of a dead node have already been expanded.
Branch and Bound refers to all state space search methods in which all children of an E-
node are generated before any other live node can become the E-node.

Depth first node generation with bounding functions is called backtracking. State
generation methods in which the E-node remains the E-node until it is dead, lead to
branch and bound methods.

N-Queens Problem:
Let us consider N = 8. The 8-queens problem is to place eight queens on an 8 x 8
chessboard so that no two "attack", that is, no two of them are on the same row, column,
or diagonal. All solutions to the 8-queens problem can be represented as 8-tuples
(x1, . . . , x8), where xi is the column of the ith row where the ith queen is placed.
The explicit constraints using this formulation are Si = {1, 2, 3, 4, 5, 6, 7, 8}, 1 ≤ i ≤ 8.
Therefore the solution space consists of 8^8 8-tuples.
The implicit constraints for this problem are that no two xi's can be the same (i.e., all
queens must be on different columns) and no two queens can be on the same diagonal.
This realization reduces the size of the solution space from 8^8 tuples to 8! tuples.
The promising function must check whether two queens are in the same column or on the
same diagonal. Suppose two queens are placed at positions (i, j) and (k, l). Then:
 Column conflict: two queens conflict if their xi values are identical, i.e., j = l.
 Diagonal conflict: two queens i and k are on the same "\" diagonal if
i – j = k – l, which implies j – l = i – k.
 Diagonal conflict: two queens i and k are on the same "/" diagonal if
i + j = k + l, which implies j – l = k – i.

Therefore, two queens lie on the same diagonal if and only if:
|j – l| = |i – k|
where j is the column of the queen in row i (the ith queen) and l is the column of the
queen in row k (the kth queen).

To check the diagonal clashes, let us take the following board configuration:

i    1  2  3  4  5  6  7  8
xi   2  5  1  8  4  7  3  6

[Figure: the queens placed on the board at columns (2, 5, 1, 8, 4, 7, 3, 6)]

Let us consider whether the queens on the 3rd row and the 8th row are conflicting.
In this case (i, j) = (3, 1) and (k, l) = (8, 6). Therefore:
|j – l| = |i – k| gives |1 – 6| = |3 – 8|, which is 5 = 5.
Since |j – l| = |i – k|, the two queens are attacking, so this is not a solution.
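The column-or-diagonal test above can be written as a small feasibility check. The sketch below is illustrative (the function name `place` and the 1-based array layout are assumptions, not taken from the notes); it tests whether a queen may sit in row k, column i, given the columns x[1..k-1] already chosen:

```python
def place(x, k, i):
    """Return True if a queen can sit in row k, column i, given queens
    already placed in rows 1..k-1 at columns x[1..k-1]."""
    for j in range(1, k):
        # Same column, or same diagonal: |j - k| == |x[j] - i|
        if x[j] == i or abs(x[j] - i) == abs(j - k):
            return False
    return True

# The configuration from the text, with index 0 unused to keep rows 1-based.
x = [0, 2, 5, 1, 8, 4, 7, 3, 6]
# Queens in rows 3 and 8 sit in columns 1 and 6: |1 - 6| == |3 - 8|,
# so column 6 is rejected for the queen in row 8.
print(place(x, 8, 6))   # False
```

Note that the diagonal test needs no explicit board: comparing column distance against row distance suffices.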
Example:
Suppose we start with the feasible sequence 7, 5, 3, 1.

Step1:
Add to the sequence the next number in the sequence 1, 2, . . . , 8 not yet used.

Step2:
If this new sequence is feasible and has length 8, then STOP with a solution. If the
new sequence is feasible and has length less than 8, repeat Step 1.

Step3:
If the sequence is not feasible, then backtrack through the sequence until we find
the most recent place at which we can exchange a value. Go back to Step 1.
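The three steps above can be sketched as a complete backtracking solver. This is a minimal illustration (the function names are assumptions), returning every solution tuple rather than stopping at the first:

```python
def n_queens(n):
    """Return all solutions as tuples (x1, ..., xn) of 1-based column numbers."""
    solutions = []
    x = []                       # x[i] = column (0-based) of queen in row i+1

    def feasible(col):
        k = len(x)
        # new queen at row k: reject equal columns and equal diagonals
        return all(x[j] != col and abs(x[j] - col) != k - j
                   for j in range(k))

    def extend():                # Steps 1-3: extend, test, backtrack
        if len(x) == n:
            solutions.append(tuple(c + 1 for c in x))
            return
        for col in range(n):
            if feasible(col):
                x.append(col)    # try this column (Step 1)
                extend()         # feasible, so go deeper (Step 2)
                x.pop()          # infeasible below: backtrack (Step 3)

    extend()
    return solutions

print(len(n_queens(4)))   # 2 solutions
print(len(n_queens(8)))   # 92 solutions
```

The 4-queens instance has exactly two solutions, (2, 4, 1, 3) and (3, 1, 4, 2), and the 8-queens instance has 92.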

[Figure: the resulting solution shown on a chessboard]

4-Queens Problem:
Let us see how backtracking works on the 4-queens problem. We start with the root node
as the only live node. This becomes the E-node. We generate one child; let us assume that
the children are generated in ascending order. Thus node number 2 of the figure is
generated and the path is now (1). This corresponds to placing queen 1 in column 1. Node
2 becomes the E-node. Node 3 is generated and immediately killed. The next node
generated is node 8 and the path becomes (1, 3). Node 8 becomes the E-node. However, it
gets killed as all its children represent board configurations that cannot lead to an answer
node. We backtrack to node 2 and generate another child, node 13. The path is now (1, 4).
The board configurations as backtracking proceeds are as follows:

[Figure: boards (a)-(h) showing the successive placements of queens 1-4;
dots mark squares that were tried and rejected]

The above figure shows graphically the steps that the backtracking algorithm goes through
as it tries to find a solution. The dots indicate placements of a queen, which were tried and
rejected because another queen was attacking.

In Figure (b) the second queen is placed on columns 1 and 2 and finally settles on column
3. In figure (c) the algorithm tries all four columns and is unable to place the next queen
on a square. Backtracking now takes place. In figure (d) the second queen is moved to the
next possible column, column 4 and the third queen is placed on column 2. The boards in
Figure (e), (f), (g), and (h) show the remaining steps that the algorithm goes through until
a solution is found.
The full state space tree for the 8-queens problem contains
1 + 8 + 8^2 + . . . + 8^8 = 19,173,961 nodes.
Sum of Subsets:
Given positive numbers wi, 1 ≤ i ≤ n, and m, this problem requires finding all subsets of
the wi whose sums are m. All solutions are k-tuples, 1 ≤ k ≤ n.
Explicit constraints: xi Є {j | j is an integer and 1 ≤ j ≤ n}.
Implicit constraints: no two xi can be the same; the sum of the corresponding wi's must
be m; and xi < xi+1, 1 ≤ i < k (a total order on indices), to avoid generating multiple
instances of the same subset (for example, (1, 2, 4) and (1, 4, 2) represent the same
subset).
A better formulation of the problem is one where the solution subset is represented by an
n-tuple (x1, . . . , xn) such that xi Є {0, 1}. For both formulations, the solution space
consists of 2^n distinct tuples.

For example, with n = 4, w = (11, 13, 24, 7) and m = 31, the desired subsets are
(11, 13, 7) and (24, 7); in the 0/1 formulation these are represented by (1, 1, 0, 1) and
(0, 0, 1, 1). The following figure shows a possible tree organization for one of the possible
formulations of the solution space for the case n = 4.
[Figure: a possible solution space organization for the sum of subsets problem, n = 4;
edges from a level-i node to a level-(i+1) node are labeled with the value of xi
(variable tuple size formulation)]

The tree corresponds to the variable tuple size formulation. The edges are labeled such that
an edge from a level i node to a level i+1 node represents a value for xi. At each node, the
solution space is partitioned into sub - solution spaces. All paths from the root node to any
node in the tree define the solution space, since any such path corresponds to a subset
satisfying the explicit constraints.
The possible paths are (1), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 4), (1, 3, 4), (2), (2, 3), and so on.
Thus, the leftmost subtree defines all subsets containing w1, the next subtree defines all
subsets containing w2 but not w1, and so on.
Graph Coloring (for planar graphs):
Let G be a graph and m be a given positive integer. We want to discover whether the
nodes of G can be colored in such a way that no two adjacent nodes have the same color,
yet only m colors are used. This is termed the m-colorability decision problem. The m-
colorability optimization problem asks for the smallest integer m for which the graph G
can be colored.
Given any map, if the regions are to be colored in such a way that no two adjacent regions
have the same color, only four colors are needed.
For many years it was known that five colors were sufficient to color any map, but no map
that required more than four colors had ever been found. After several hundred years, this
problem was solved by a group of mathematicians with the help of a computer. They
showed that in fact four colors are sufficient for planar graphs.
The function mcoloring begins by first assigning the graph to its adjacency matrix and
setting the array x[] to zero. The colors are represented by the integers 1, 2, . . . , m and
the solutions are given by the n-tuple (x1, x2, . . . , xn), where xi is the color of node i.

A recursive backtracking algorithm for graph coloring is carried out by invoking the
statement mcoloring(1);

Algorithm mcoloring(k)
// This algorithm was formed using the recursive backtracking schema. The graph is
// represented by its Boolean adjacency matrix G[1: n, 1: n]. All assignments of
// 1, 2, . . . , m to the vertices of the graph such that adjacent vertices are assigned
// distinct integers are printed. k is the index of the next vertex to color.
{
    repeat
    { // Generate all legal assignments for x[k].
        NextValue(k);               // Assign to x[k] a legal color.
        if (x[k] = 0) then return;  // No new color possible.
        if (k = n) then             // At most m colors have been used
            write (x[1: n]);        // to color the n vertices.
        else mcoloring(k + 1);
    } until (false);
}

Algorithm NextValue(k)
// x[1], . . . , x[k-1] have been assigned integer values in the range [1, m] such that
// adjacent vertices have distinct integers. A value for x[k] is determined in the range
// [0, m]. x[k] is assigned the next highest numbered color while maintaining distinctness
// from the adjacent vertices of vertex k. If no such color exists, then x[k] is 0.
{
    repeat
    {
        x[k] := (x[k] + 1) mod (m + 1);   // Next highest color.
        if (x[k] = 0) then return;        // All colors have been used.
        for j := 1 to n do
        { // Check if this color is distinct from adjacent colors.
            if ((G[k, j] ≠ 0) and (x[k] = x[j]))
                then break;   // (k, j) is an edge and adjacent vertices share a color.
        }
        if (j = n + 1) then return;       // New color found.
    } until (false);                      // Otherwise try to find another color.
}
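The mcoloring/NextValue scheme above can be rendered as a short runnable sketch. The translation below is an assumption (0-based indexing, a 0/1 adjacency matrix, and the name `mcoloring` reused for convenience); it collects all proper colorings instead of printing them:

```python
def mcoloring(G, m):
    """Return all assignments (x1, ..., xn) of colors 1..m such that
    adjacent vertices receive distinct colors. G is a 0/1 adjacency matrix."""
    n = len(G)
    x = [0] * n          # x[k] = 0 means vertex k is uncolored
    out = []

    def color(k):        # k = index of the next vertex to color
        for c in range(1, m + 1):
            # accept c only if it differs from every colored adjacent vertex
            if all(not G[k][j] or x[j] != c for j in range(n)):
                x[k] = c
                if k == n - 1:
                    out.append(tuple(x))
                else:
                    color(k + 1)
                x[k] = 0          # backtrack
    color(0)
    return out

# For illustration: the 4-cycle 1-2-3-4-1 admits 18 proper 3-colorings
# (this is the standard count for a cycle, not necessarily the notes' graph).
G = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
print(len(mcoloring(G, 3)))
```

Trying each color in increasing order at vertex k mirrors NextValue's "next highest numbered color" rule.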

Example:

Color the graph given below with the minimum number of colors by backtracking, using a
state space tree.

[Figure: a 4-node graph and the state space tree of all possible 3-colorings;
successive levels of the tree assign x1, x2, x3 and x4]


Hamiltonian cycles

Let G = (V, E) be a connected graph with n vertices. A Hamiltonian cycle (suggested by
William Hamilton) is a round-trip path along n edges of G that visits every vertex once
and returns to its starting position. In other words, if the vertices of G are visited in the
order v1, v2, . . . , vn+1, then the edges (vi, vi+1) are in E, 1 ≤ i ≤ n, and the vi are
distinct except for v1 and vn+1, which are equal. The graph G1 contains the Hamiltonian
cycle 1, 2, 8, 7, 6, 5, 4, 3, 1. The graph G2 contains no Hamiltonian cycle.

[Figure: two graphs to illustrate Hamiltonian cycles. Graph G1, on vertices 1-8,
contains a Hamiltonian cycle; graph G2, on vertices 1-5, contains none.]

The backtracking solution vector (x1, . . . , xn) is defined so that xi represents the
ith visited vertex of the proposed cycle. If k = 1, then x1 can be any of the n vertices; to
avoid printing the same cycle n times, we require that x1 = 1. If 1 < k < n, then xk can be
any vertex v that is distinct from x1, x2, . . . , xk-1 and is connected by an edge to xk-1.
The vertex xn can only be the one remaining vertex, and it must be connected to both xn-1
and x1.

Using the NextValue algorithm we can particularize the recursive backtracking schema to
find all Hamiltonian cycles. This algorithm is started by first initializing the adjacency
matrix G[1: n, 1: n], then setting x[2: n] to zero and x[1] to 1, and then executing
Hamiltonian(2).

The traveling salesperson problem using dynamic programming asked for a tour of
minimum cost. Such a tour is a Hamiltonian cycle. For the simple case of a graph all of
whose edge costs are identical, Hamiltonian will find a minimum-cost tour if a tour exists.
Algorithm NextValue(k)
// x[1: k-1] is a path of k-1 distinct vertices. If x[k] = 0, then no vertex has yet been
// assigned to x[k]. After execution, x[k] is assigned the next highest numbered vertex
// which does not already appear in x[1: k-1] and is connected by an edge to x[k-1].
// Otherwise x[k] = 0. If k = n, then in addition x[k] must be connected to x[1].
{
    repeat
    {
        x[k] := (x[k] + 1) mod (n + 1);   // Next vertex.
        if (x[k] = 0) then return;
        if (G[x[k-1], x[k]] ≠ 0) then
        { // Is there an edge?
            for j := 1 to k-1 do
                if (x[j] = x[k]) then break;          // Check for distinctness.
            if (j = k) then                           // The vertex is distinct.
                if ((k < n) or ((k = n) and G[x[n], x[1]] ≠ 0))
                    then return;
        }
    } until (false);
}

Algorithm Hamiltonian(k)
// This algorithm uses the recursive formulation of backtracking to find all the
// Hamiltonian cycles of a graph. The graph is stored as an adjacency matrix
// G[1: n, 1: n]. All cycles begin at node 1.
{
    repeat
    { // Generate values for x[k].
        NextValue(k);                // Assign a legal next value to x[k].
        if (x[k] = 0) then return;
        if (k = n) then write (x[1: n]);
        else Hamiltonian(k + 1);
    } until (false);
}
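The Hamiltonian/NextValue pair above can be condensed into one runnable sketch. The translation is an assumption (0-based indices internally, x1 fixed to vertex 1, and each undirected cycle reported once per direction, exactly as the pseudocode would print it):

```python
def hamiltonian_cycles(G):
    """Return all Hamiltonian cycles of the 0/1 adjacency matrix G,
    each as a list of 1-based vertex labels starting at vertex 1."""
    n = len(G)
    cycles = []
    x = [0]                       # x[0] = vertex 0, i.e. "vertex 1" in the text

    def extend():
        k = len(x)
        if k == n:
            if G[x[-1]][x[0]]:    # x[n] must connect back to x[1]
                cycles.append([v + 1 for v in x])
            return
        for v in range(n):
            # v must be distinct from x[1..k-1] and adjacent to x[k-1]
            if v not in x and G[x[-1]][v]:
                x.append(v)
                extend()
                x.pop()           # backtrack

    extend()
    return cycles

# Complete graph on 4 vertices: (n-1)! = 6 cycles starting at vertex 1
# (each undirected cycle is found in both directions).
K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(len(hamiltonian_cycles(K4)))
```

Fixing the first vertex plays the same role as setting x[1] = 1 in the text: it stops the same cycle from being reported once per starting vertex.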

BRANCH AND BOUND

General method:

Branch and Bound is another method to systematically search a solution space. Just like
backtracking, we use bounding functions to avoid generating subtrees that do not contain
an answer node. However, Branch and Bound differs from backtracking in two ways:

1. It has a branching function, which can be depth first search, breadth first search,
or based on the bounding function.

2. It has a bounding function, which goes far beyond the feasibility test as a means to
prune the search tree efficiently.

Branch and Bound refers to all state space search methods in which all children of the E-
node are generated before any other live node becomes the E-node.

Branch and Bound is a generalization of both graph search strategies, BFS and D-search.

 A BFS-like state space search is called FIFO (first in, first out) search, as the
list of live nodes is a first-in-first-out list (or queue).

 A D-search-like state space search is called LIFO (last in, first out) search, as
the list of live nodes is a last-in-first-out list (or stack).

Definition 1: Live node is a node that has been generated but whose children have not yet
been generated.

Definition 2: E-node is a live node whose children are currently being explored. In other
words, an E-node is a node currently being expanded.

Definition 3: Dead node is a generated node that is not to be expanded or explored any
further. All children of a dead node have already been expanded.

Definition 4: Branch-and-bound refers to all state space search methods in which all
children of an E-node are generated before any other live node can become
the E-node.
Least Cost (LC) search:

In both LIFO and FIFO Branch and Bound, the selection rule for the next E-node is rigid
and blind. The selection rule for the next E-node does not give any preference to a node
that has a very good chance of getting the search to an answer node quickly.

The search for an answer node can be speeded up by using an "intelligent" ranking
function ĉ(.) for live nodes. The next E-node is selected on the basis of this ranking
function. The node x is assigned a rank using:
ĉ(x) = f(h(x)) + ĝ(x)
where ĉ(x) is the cost of x,
h(x) is the cost of reaching x from the root and f(.) is any non-decreasing function, and
ĝ(x) is an estimate of the additional effort needed to reach an answer node from x.

A search strategy that uses the cost function ĉ(x) = f(h(x)) + ĝ(x) to select the next
E-node always chooses as its next E-node a live node with least ĉ(.). It is therefore
known as LC-search (Least Cost search).

BFS and D-search are special cases of LC-search. If ĝ(x) = 0 and f(h(x)) = level of
node x, then an LC-search generates nodes by levels; this is essentially the same as
a BFS. If f(h(x)) = 0 and ĝ(x) ≥ ĝ(y) whenever y is a child of x, then the search is
essentially a D-search.
An LC-search coupled with bounding functions is called an LC branch and bound search.

We associate a cost c(x) with each node x in the state space tree. It is usually not
possible to compute c(x) easily, so we compute an estimate ĉ(x) of c(x).
Control Abstraction for LC-Search:
Let t be a state space tree and c(.) a cost function for the nodes in t. If x is a node in t, then
c(x) is the minimum cost of any answer node in the subtree with root x. Thus, c(t) is the
cost of a minimum-cost answer node in t.

A heuristic ĉ(.) is used to estimate c(.). This heuristic should be easy to compute and
generally has the property that if x is either an answer node or a leaf node, then
ĉ(x) = c(x).

LC-search uses ĉ to find an answer node. The algorithm uses two functions Least() and
Add() to delete and add a live node from or to the list of live nodes, respectively.

Least() finds a live node with least ĉ(.). This node is deleted from the list of live nodes
and returned.
Add(x) adds the new live node x to the list of live nodes. The list of live nodes can be
implemented as a min-heap.

Algorithm LCSearch outputs the path from the answer node it finds to the root node
t. This is easy to do if, with each node x that becomes live, we associate a field parent
which gives the parent of node x. When the answer node g is found, the path from g to t
can be determined by following the sequence of parent values starting from the current E-
node (which is the parent of g) and ending at node t.

listnode = record
{
    listnode *next, *parent;
    float cost;
}

Algorithm LCSearch(t)
{ // Search t for an answer node.
    if *t is an answer node then output *t and return;
    E := t;                                   // E-node.
    Initialize the list of live nodes to be empty;
    repeat
    {
        for each child x of E do
        {
            if x is an answer node then
                output the path from x to t and return;
            Add(x);                           // x is a new live node.
            (x -> parent) := E;               // Pointer for path to root.
        }
        if there are no more live nodes then
        {
            write ("No answer node");
            return;
        }
        E := Least();
    } until (false);
}

The root node is the first E-node. During the execution of LC-search, this list contains all
live nodes except the E-node. Initially this list is empty. Examine all the children of the
E-node; if one of the children is an answer node, then the algorithm outputs the path from
x to t and terminates. If a child of E is not an answer node, then it becomes a live node:
it is added to the list of live nodes and its parent field is set to E. When all the children
of E have been generated, E becomes a dead node. This happens only if none of E's
children is an answer node. If no live nodes remain, the search stops; otherwise, Least(),
by definition, correctly chooses the next E-node and the search continues from there.
LC-search terminates only when either an answer node is found or the entire state space
tree has been generated and searched.
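The LCSearch control abstraction above can be sketched with a min-heap standing in for Least() and Add(), and parent links kept alongside each heap entry. The toy instance at the bottom is purely an assumption for illustration (nodes are integers, the children of x are 2x and 2x+1, and ĉ counts how many binary digits of the target are still missing):

```python
import heapq

def lc_search(root, children, is_answer, c_hat):
    """Return the path root -> answer node, or None if no answer exists.
    Least() is a heap pop keyed on c_hat; Add() is a heap push."""
    heap = [(c_hat(root), 0, root, None)]   # (estimate, tiebreak, node, parent entry)
    counter = 1
    while heap:
        entry = heapq.heappop(heap)         # Least(): live node with least c_hat
        _, _, x, _ = entry
        if is_answer(x):
            path = []
            while entry is not None:        # follow parent pointers back to the root
                path.append(entry[2])
                entry = entry[3]
            return path[::-1]
        for y in children(x):               # expand: x is the E-node
            heapq.heappush(heap, (c_hat(y), counter, y, entry))
            counter += 1
    return None                             # no answer node

# Toy instance (an assumption): find node 11 in the implicit binary tree
# rooted at 1; c_hat(x) = 0 exactly when x's bits are a prefix of 11's bits.
def children(x): return [2 * x, 2 * x + 1] if x < 16 else []
def c_hat(x):
    t, b = bin(11)[2:], bin(x)[2:]
    return len(t) - len(b) if t.startswith(b) else 100
print(lc_search(1, children, lambda x: x == 11, c_hat))
```

With this ĉ the heap always prefers nodes on the path to the answer, so the search walks straight to it; with ĝ(x) = 0 and f(h(x)) = level, the same loop would degenerate into a BFS, as the text notes.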
Bounding:
A branch and bound method searches a state space tree using any search mechanism in
which all the children of the E-node are generated before another node becomes the E-
node. We assume that each answer node x has a cost c(x) associated with it and that a
minimum-cost answer node is to be found. Three common search strategies are FIFO,
LIFO, and LC. The three search methods differ only in the selection rule used to obtain
the nextE-node.
Good bounding helps to prune the tree efficiently, leading to a faster exploration of the
solution space.
A cost function ĉ(.) such that ĉ(x) ≤ c(x) is used to provide lower bounds on solutions
obtainable from any node x. If upper is an upper bound on the cost of a minimum-cost
solution, then all live nodes x with ĉ(x) > upper may be killed. The starting value for
upper can be obtained by some heuristic or can be set to ∞.

As long as the initial value for upper is not less than the cost of a minimum-cost answer
node, the above rules to kill live nodes will not result in the killing of a live node that can
reach a minimum-cost answer node. Each time a new answer node is found, the value of
upper can be updated.

Branch-and-bound algorithms are used for optimization problems; here we deal directly
only with minimization problems. A maximization problem is easily converted to a
minimization problem by changing the sign of the objective function.

To formulate the search for an optimal solution as a search for a least-cost answer node in
a state space tree, it is necessary to define the cost function c(.) such that c(x) is minimum
for all nodes representing an optimal solution. The easiest way to do this is to use the
objective function itself for c(.).

 For nodes representing feasible solutions, c(x) is the value of the objective
function for that feasible solution.

 For nodes representing infeasible solutions, c(x) = ∞.

 For nodes representing partial solutions, c(x) is the cost of the minimum-cost node
in the subtree with root x.

Since c(x) is generally hard to compute, the branch-and-bound algorithm uses an
estimate ĉ(x) such that ĉ(x) ≤ c(x) for all x.
Sum-of-Subsets problem

 In this problem, we are given a vector of N values, called weights. The weights are
usually given in ascending order of magnitude and are unique.
 For example, W= (2, 4, 6, 8, 10) is a weight vector. We are also given a value M, for
example 20.
 The problem is to find all combinations of the weights that exactly add to M.
 In this example, the weights that add to 20 are:
(2, 4, 6, 8); (2, 8, 10); and (4, 6, 10).
 Solutions to this problem are often expressed by an N-bit binary solution vector, X,
where a 1 in position i indicates that Wi is part of the solution and a 0 indicates it is
not.
 In this manner the three solutions above could be expressed as: (1,1,1,1,0);
(1,0,0,1,1); (0,1,1,0,1)

Sum-of-Subsets problem
 We are given n positive numbers called weights and we have to find all
combinations of these numbers whose sum is M. This is called the sum of subsets
problem.
 If we consider the backtracking procedure using the fixed tuple strategy, the
element X(i) of the solution vector is either 1 or 0 depending on whether the weight
W(i) is included or not.
 In the state space tree of the solution, for a node at level i, the left child corresponds
to X(i) = 1 and the right child to X(i) = 0.

Sum of Subsets Algorithm


void SumOfSub(float s, int k, float r)
{
    // Generate left child.
    x[k] = 1;
    if (s + w[k] == m) {                     // subset found
        for (int j = 1; j <= k; j++)
            print(x[j]);
    }
    else if (s + w[k] + w[k+1] <= m)
        SumOfSub(s + w[k], k + 1, r - w[k]);
    // Generate right child and evaluate.
    if ((s + r - w[k] >= m) && (s + w[k+1] <= m)) {
        x[k] = 0;
        SumOfSub(s, k + 1, r - w[k]);
    }
}

Sum of Subsets State Space Tree

Example: n = 6, w[1:6] = {5, 10, 12, 13, 15, 18}, m = 30
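The SumOfSub algorithm can be run on this instance with a short Python rendering. The translation is an assumption (0-based indices, explicit bounds checks added where the pseudocode indexes w[k+1], and solutions collected as weight lists rather than printed bit vectors):

```python
def sum_of_sub(w, m):
    """Return all subsets of w (given in nondecreasing order) summing to m,
    following the SumOfSub left/right-child bounding scheme."""
    n, out = len(w), []
    x = [0] * n

    def rec(s, k, r):
        # s = sum of chosen weights so far, r = sum of remaining weights
        x[k] = 1                                  # left child: include w[k]
        if s + w[k] == m:                         # subset found
            out.append([w[j] for j in range(k + 1) if x[j]])
        elif k + 1 < n and s + w[k] + w[k + 1] <= m:
            rec(s + w[k], k + 1, r - w[k])
        # right child: exclude w[k], but only if a solution remains possible
        if s + r - w[k] >= m and k + 1 < n and s + w[k + 1] <= m:
            x[k] = 0
            rec(s, k + 1, r - w[k])
        x[k] = 0

    rec(0, 0, sum(w))
    return out

print(sum_of_sub([5, 10, 12, 13, 15, 18], 30))
```

For this instance the three subsets found are {5, 10, 15}, {5, 12, 13} and {12, 18}, matching a hand trace of the state space tree.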

Branch and Bound Principle

 The term branch-and-bound refers to all state space search methods in which all
children of the E-node are generated before any other live node can become the E-
node.
 We have already seen two graph search strategies, BFS and D-search, in
which the exploration of a new node cannot begin until the node currently
being explored is fully explored.
 Both of these generalize to branch-and-bound strategies.
 In branch-and-bound terminology, a BFS-like state space search is called FIFO
(First In First Out) search, as the list of live nodes is a first-in-first-out list (or queue).
 A D-search-like state space search is called LIFO (Last In First Out) search, as
the list of live nodes is a last-in-first-out list (or stack).

Control Abstraction for Branch and Bound (LC Method)

LC Method Control Abstraction: Explanation

 The search for an answer node can often be speeded up by using an "intelligent"
ranking function ĉ(.) for live nodes.
 The next E-node is selected on the basis of this ranking function.
 Let T be a state space tree and c(.) a cost function for the nodes in T. If X is a node
in T, then c(X) is the minimum cost of any answer node in the subtree with root X.
Thus, c(T) is the cost of a minimum-cost answer node.
 The algorithm uses two subalgorithms LEAST(X) and ADD(X) to respectively
delete and add a live node from or to the list of live nodes.
 LEAST(X) finds a live node with least ĉ(.). This node is deleted from the list of
live nodes and returned in variable X.
 ADD(X) adds the new live node X to the list of live nodes.
 Procedure LC outputs the path from the answer node it finds to the root.
The 0/1 knapsack problem


The Branching Mechanism in the Branch-and-Bound Strategy to Solve the 0/1 Knapsack
Problem.

How do we find the upper bound?

Ans: by quickly finding a feasible solution in a greedy manner: starting from the
smallest available i, scan towards the largest i until M is exceeded. The
upper bound can then be calculated.
How do we find the ranking function?

How do we expand the tree?

 By the best-first search scheme.

 That is, by expanding the node with the best lower bound. If two nodes have the
same lower bound, expand the node with the lower upper bound.

0/1 Knapsack algorithm using BB

0/1 Knapsack Example using LCBB (Least Cost)

 Example (LCBB)
 Consider the knapsack instance:
 n = 4;
 (p1, p2, p3, p4) = (10, 10, 12, 18);
 (w1, w2, w3, w4) = (2, 4, 6, 9); and
 M = 15.
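This instance can be solved with a compact LC branch-and-bound sketch. It is an illustration, not the exact procedure traced in the notes: profits are negated so maximization becomes heap-ordered minimization, and the bound at each node is the standard fractional-knapsack relaxation (an assumed choice of ranking function):

```python
import heapq

def knapsack_lcbb(p, w, M):
    """Best-first 0/1 knapsack: return the maximum achievable profit."""
    n = len(p)
    order = sorted(range(n), key=lambda i: p[i] / w[i], reverse=True)
    p = [p[i] for i in order]
    w = [w[i] for i in order]

    def bound(profit, weight, k):
        # optimistic profit: fill remaining capacity greedily, last item fractionally
        b, cap = profit, M - weight
        for i in range(k, n):
            if w[i] <= cap:
                cap -= w[i]
                b += p[i]
            else:
                return b + p[i] * cap / w[i]
        return b

    best = 0
    heap = [(-bound(0, 0, 0), 0, 0, 0)]     # (-bound, level, profit, weight)
    while heap:
        neg_b, k, profit, weight = heapq.heappop(heap)
        if -neg_b <= best or k == n:        # bound cannot beat the best solution
            continue
        if weight + w[k] <= M:              # left child: include item k
            best = max(best, profit + p[k])
            heapq.heappush(heap, (-bound(profit + p[k], weight + w[k], k + 1),
                                  k + 1, profit + p[k], weight + w[k]))
        # right child: exclude item k
        heapq.heappush(heap, (-bound(profit, weight, k + 1), k + 1, profit, weight))
    return best

print(knapsack_lcbb([10, 10, 12, 18], [2, 4, 6, 9], 15))
```

For this instance the optimal profit is 38 (items 1, 2 and 4, with total weight 2 + 4 + 9 = 15).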

0/1 Knapsack State Space tree of the Example using LCBB

0/1 Knapsack State Space tree of the Example using FIFO BB


The traveling salesperson problem

Given a graph, the TSP optimization problem is to find a tour, starting from
any vertex, visiting every other vertex and returning to the starting vertex,
with minimal cost.

The basic idea

There is a way to split the solution space (branch).

There is a way to predict a lower bound for a class of solutions. There is also a way to
find an upper bound on the cost of an optimal solution. If the lower bound of a class of
solutions exceeds the upper bound, these solutions cannot be optimal, and thus we
terminate the branching associated with them.

Example- TSP

Example with Cost Matrix (a) and its Reduced Cost Matrix (b)
A reduced matrix is one in which every row and column contains at least one zero and
all other entries are non-negative.
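Row-and-column reduction can be sketched directly; the total amount subtracted is the lower bound the reduced matrix contributes at a node. The sample matrix below is an assumption for illustration, not the one pictured in the notes (INF marks the forbidden diagonal entries):

```python
import math

INF = math.inf

def reduce_matrix(a):
    """Reduce each row, then each column, so every row and column contains
    a zero; return (reduced matrix, total amount subtracted)."""
    n = len(a)
    a = [row[:] for row in a]      # work on a copy
    total = 0
    for i in range(n):             # row reduction
        m = min(a[i])
        if 0 < m < INF:
            total += m
            a[i] = [x - m if x < INF else x for x in a[i]]
    for j in range(n):             # column reduction
        m = min(a[i][j] for i in range(n))
        if 0 < m < INF:
            total += m
            for i in range(n):
                if a[i][j] < INF:
                    a[i][j] -= m
    return a, total

# Illustrative 4-city cost matrix (an assumption).
cost = [[INF, 20, 30, 10],
        [15, INF, 16, 4],
        [3, 5, INF, 2],
        [19, 6, 18, INF]]
reduced, r = reduce_matrix(cost)
print(r)   # the reduction value is a lower bound on the cost of any tour
```

Here the row minima (10, 4, 2, 6) and the column minima of the row-reduced matrix (1, 0, 12, 0) sum to 35, so no tour of this instance can cost less than 35.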
Reduced Matrix for nodes 2, 3, . . . , 10 of the State Space tree using the LC Method

State Space tree of the Example using the LC Method


UNIT-V

NP-HARD AND NP-COMPLETE PROBLEMS

Basic concepts:

NP stands for Nondeterministic Polynomial time.


The problems whose best algorithms are known have computing times that cluster into
two groups.

Group I
Problems whose solution times are bounded by polynomials of small degree. They are also
called tractable problems. Most searching and sorting algorithms are polynomial time
algorithms. Ex: ordered search O(log n), polynomial evaluation O(n),
sorting O(n log n).

Group II
Problems whose solution times are not bounded by polynomials (simply non-polynomial).
These are hard or intractable problems. None of the problems in this group has been solved
by any polynomial time algorithm. Ex: Traveling Salesperson O(n^2 2^n), Knapsack O(2^(n/2)).

No one has been able to develop a polynomial time algorithm for any problem in
Group II. Algorithms whose computing times grow faster than any polynomial quickly
require such vast amounts of time to execute that even moderate-size problem instances
cannot be solved.

Theory of NP-Completeness:

The theory of NP-completeness shows that many of the problems with no known
polynomial time algorithms are computationally related.

There are two classes of non-polynomial time problems

1. NP-Hard
2. NP-Complete

NP-Complete Problem: A problem that is NP-Complete can be solved in polynomial time if
and only if (iff) all other NP-Complete problems can also be solved in polynomial time.

NP-Hard: A problem is NP-Hard if, whenever it can be solved in polynomial time, all
NP-Complete problems can be solved in polynomial time.

All NP-Complete problems are NP-Hard, but some NP-Hard problems are not known to be
NP-Complete.
Complete.
Nondeterministic Algorithms:

Algorithms with the property that the result of every operation is uniquely defined are termed
as deterministic algorithms. Such algorithms agree with the way programs are executed on a
computer.

Algorithms which contain operations whose outcomes are not uniquely defined but are
limited to a specified set of possibilities are called nondeterministic algorithms.

The machine executing such operations is allowed to choose any one of these outcomes
subject to a termination condition to be defined later.

To specify nondeterministic algorithms, there are 3 new functions.

Choice(S) arbitrarily chooses one of the elements of set S.

Failure () Signals an Unsuccessful completion

Success () Signals a successful completion.

Example of a nondeterministic algorithm:

Algorithm Search(x)
{
    // Problem: search for an element x.
    // Output J such that A[J] = x, or J = 0 if x is not in A.
    J := Choice(1, n);
    if (A[J] = x) then
    {
        write(J);
        Success();
    }
    else
    {
        write(0);
        Failure();
    }
}
Whenever there is a set of choices that leads to a successful completion then one such set of
choices is always made and the algorithm terminates.
A Nondeterministic algorithm terminates unsuccessfully if and only if (iff) there exists no set
of choices leading to a successful signal.
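This success/failure semantics can be simulated deterministically by trying every possible outcome of Choice: the nondeterministic algorithm succeeds iff some sequence of choices succeeds. A sketch of Search(x) in this style (function name assumed for illustration):

```python
def nd_search(A, x):
    """Deterministic simulation of the nondeterministic Search: try every
    value Choice(1, n) could take; return the first successful J, or 0."""
    n = len(A)
    for J in range(1, n + 1):      # Choice(1, n): explore all outcomes
        if A[J - 1] == x:          # this branch would reach Success()
            return J
    return 0                       # every branch reached Failure()

print(nd_search([7, 3, 9, 3], 9))   # some choice succeeds: J = 3
print(nd_search([7, 3, 9, 3], 5))   # no choice succeeds: J = 0
```

The nondeterministic machine "guesses" J in one step; the deterministic simulation pays for that guess with a loop over all n possibilities, which is the intuition behind the P versus NP gap discussed below.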

Nondeterministic Knapsack algorithm


Algorithm DKP(p, w, n, m, r, x)
{
    W := 0;
    P := 0;
    for i := 1 to n do
    {
        x[i] := Choice(0, 1);
        W := W + x[i] * w[i];
        P := P + x[i] * p[i];
    }
    if ((W > m) or (P < r)) then Failure();
    else Success();
}

Here p is the given profits, w the given weights, n the number of elements (the length of
p or w), m the weight limit of the bag, P the final profit, and W the final weight.

The Classes NP-Hard & NP-Complete:


For measuring the complexity of an algorithm, we use the input length as the parameter.
For example, an algorithm A is of polynomial complexity if there exists a polynomial p()
such that the computing time of A is O(p(n)) for every input of size n.
Decision problem/ Decision algorithm: Any problem for which the answer is either zero or
one is decision problem. Any algorithm for a decision problem is termed a decision algorithm.
Optimization problem/ Optimization algorithm: Any problem that involves the
identification of an optimal (either minimum or maximum) value of a given cost function is
known as an optimization problem. An optimization algorithm is used to solve an optimization
problem.

P is the set of all decision problems solvable by deterministic algorithms in polynomial time.
NP is the set of all decision problems solvable by nondeterministic algorithms in polynomial
time.

Since deterministic algorithms are just a special case of nondeterministic ones, we conclude
that P ⊆ NP.

Commonly believed relationship between P & NP

The most famous open problem in computer science is whether P = NP or P ≠ NP.
In considering this problem, S. Cook formulated the following question:

Is there any single problem in NP such that, if we showed it to be in P, then that would imply
P = NP?
Cook answered this question with the following theorem.

Theorem: Satisfiability is in P if and only if (iff) P=NP

Notion of Reducibility

Let L1 and L2 be problems, Problem L1 reduces to L2 (written L1 α L2) iff there is a way to
solve L1 by a deterministic polynomial time algorithm using a deterministic algorithm that
solves L2 in polynomial time

This implies that, if we have a polynomial time algorithm for L2, Then we can solve L1 in
polynomial time.

Here α is a transitive relation i.e., L1 α L2 and L2 α L3 then L1 α L3

A problem L is NP-Hard if and only if (iff) satisfiability reduces to L, i.e., Satisfiability α L

A problem L is NP-Complete if and only if (iff) L is NP-Hard and L Є NP
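The reducibility relation α can be illustrated with a classic example: VERTEX-COVER α INDEPENDENT-SET, since a graph G = (V, E) has a vertex cover of size k iff it has an independent set of size |V| − k. A brute-force Python sketch (function names are illustrative, and the exponential search is only for demonstration):

```python
from itertools import combinations

def has_independent_set(V, E, size):
    """True iff some set of `size` vertices contains no edge of E."""
    return any(all(not (u in S and v in S) for u, v in E)
               for S in map(set, combinations(V, size)))

def has_vertex_cover(V, E, k):
    # VERTEX-COVER alpha INDEPENDENT-SET: transform the instance
    # (k -> |V| - k) and call the INDEPENDENT-SET solver.
    return has_independent_set(V, E, len(V) - k)

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4)]
print(has_vertex_cover(V, E, 2))  # True: {2, 3} covers every edge
```

Because the transformation itself takes polynomial time, a polynomial-time algorithm for INDEPENDENT-SET would yield one for VERTEX-COVER, which is exactly what L1 α L2 means.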

Commonly believed relationship among P, NP, NP-Complete and NP-Hard

Most natural problems in NP are either in P or NP-complete.

Examples of NP-complete problems:


Packing problems: SET-PACKING, INDEPENDENT-SET, CLIQUE.
Covering problems: SET-COVER, VERTEX-COVER.
Sequencing problems: HAMILTONIAN-CYCLE, TSP.
Partitioning problems: 3-COLOR.
Constraint satisfaction problems: SAT, 3-SAT.
Numerical problems: SUBSET-SUM, PARTITION, KNAPSACK.
Approximation Algorithms

Overview :
An approximation algorithm is a way of dealing with NP-completeness for an optimization
problem. This technique does not guarantee the best solution. The goal of the approximation
algorithm is to come as close as possible to the optimal solution in polynomial time. Such
algorithms are called approximation algorithms or heuristic algorithms.
Features of Approximation Algorithm :
Here, we will discuss the features of the Approximation Algorithm as follows.
 An approximation algorithm is guaranteed to run in polynomial time, though it does not
guarantee the optimal solution.
 An approximation algorithm seeks a high-accuracy, high-quality solution (say, within
1% of the optimum).
 Approximation algorithms are used to get an answer near the optimal solution of an
optimization problem in polynomial time.
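As a concrete illustration of these features, the well-known 2-approximation for VERTEX-COVER runs in polynomial time and returns a cover at most twice the optimal size (a sketch; the function name is illustrative):

```python
def approx_vertex_cover(edges):
    """Greedy matching-based 2-approximation for VERTEX-COVER:
    pick any uncovered edge and take BOTH of its endpoints."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(approx_vertex_cover(edges))  # {1, 2, 3, 4}
```

On this path graph the optimal cover is {2, 4} (size 2), and the algorithm returns 4 vertices, exactly the factor-2 guarantee; each chosen edge forces at least one of its endpoints into any optimal cover.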

Randomized Algorithms
Randomized algorithms in data structures and algorithms (DSA) are algorithms that use
randomness in their computations to achieve a desired outcome. These algorithms introduce
randomness to improve efficiency or simplify the algorithm design. By incorporating random
choices into their processes, randomized algorithms can often provide faster solutions or better
approximations compared to deterministic algorithms. They are particularly useful in situations
where exact solutions are difficult to find or when a probabilistic approach is acceptable.
For example, in Randomized Quick Sort, we use a random number to pick the next pivot (or we
randomly shuffle the array). Typically, this randomness is used to reduce time complexity or
space complexity in other standard algorithms.
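A minimal sketch of the Randomized Quick Sort idea mentioned above, choosing the pivot uniformly at random so that the expected running time is O(n log n) on every input:

```python
import random

def randomized_quicksort(a):
    """Quick sort with a uniformly random pivot choice."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)                    # the random step
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([7, 2, 9, 4, 2]))  # [2, 2, 4, 7, 9]
```

Unlike the deterministic first-element pivot, no fixed input (such as an already-sorted array) can force the random-pivot version into its worst case with more than vanishing probability.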

STRING MATCHING
String matching algorithms have greatly influenced computer science and play an essential role
in various real-world problems. It helps in performing time-efficient tasks in multiple domains.
These algorithms are useful in the case of searching a string within another string. String
matching is also used in the Database schema, Network systems.
Let us look at a few string matching algorithms before proceeding to their applications in real
world. String Matching Algorithms can broadly be classified into two types of algorithms –
1. Exact String Matching Algorithms
2. Approximate String Matching Algorithms
Exact String Matching Algorithms:
Exact string matching algorithms find one, several, or all occurrences of a given string
(pattern) in a larger string (text or sequence) such that each match is exact: every character of
the pattern must match the corresponding character of the matched substring. These are further
classified into four categories:
1. Algorithms based on character comparison:
 Naive Algorithm: It slides the pattern over text one by one and check for a match.
If a match is found, then slides by 1 again to check for subsequent matches.
 KMP (Knuth Morris Pratt) Algorithm: The idea is whenever a mismatch is
detected, we already know some of the characters in the text of the next window. So, we
take advantage of this information to avoid matching the characters that we know will
anyway match.
 Boyer Moore Algorithm: This algorithm uses best heuristics of Naive and KMP
algorithm and starts matching from the last character of the pattern.
 Using the Trie data structure: It is used as an efficient information retrieval data
structure. It stores the keys as paths in a tree, one character per edge, so lookups take
time proportional to the key length.
2. Deterministic Finite Automaton (DFA) method:
 Automaton Matcher Algorithm: It starts from the first state of the automata and
the first character of the text. At every step, it considers next character of text, and look for
the next state in the built finite automata and move to a new state.
3. Algorithms based on multiple pattern matching (automaton method):
 Aho-Corasick Algorithm: It finds all words in O(n + m + z) time, where n is the
length of the text, m is the total number of characters in all the words, and z is the total
number of occurrences of the words in the text. This algorithm forms the basis of the
original Unix command fgrep.
4. Hashing-string matching algorithms:
 Rabin Karp Algorithm: It matches the hash value of the pattern with the hash
value of current substring of text, and if the hash values match then only it starts matching
individual characters.
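A minimal sketch of the Rabin-Karp idea: compare rolling hash values first and verify character by character only when the hashes match. The base and modulus below (256 and 101) are illustrative parameter choices:

```python
def rabin_karp(text, pattern, q=101, d=256):
    """Return all start indices where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)          # d^(m-1) mod q, to drop the leading char
    p_hash = t_hash = 0
    for i in range(m):            # hashes of pattern and first text window
        p_hash = (d * p_hash + ord(pattern[i])) % q
        t_hash = (d * t_hash + ord(text[i])) % q
    matches = []
    for s in range(n - m + 1):
        # verify characters only when the cheap hash comparison passes
        if p_hash == t_hash and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:             # roll the hash to the next window
            t_hash = (d * (t_hash - ord(text[s]) * h) + ord(text[s + m])) % q
    return matches

print(rabin_karp("ababcabab", "abab"))  # [0, 5]
```

The character-by-character check makes the output exact even when two different windows collide on the same hash value; the rolling update keeps each shift O(1).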
Approximate String Matching Algorithms:
Approximate String Matching Algorithms (also known as Fuzzy String Searching) searches for
substrings of the input string. More specifically, the approximate string matching approach is
stated as follows: Suppose that we are given two strings, text T[1…n] and pattern P[1…m]. The
task is to find all the occurrences of patterns in the text whose edit distance to the pattern is at
most k. Some well known edit distances are – Levenshtein edit distance and Hamming edit
distance.
These techniques are used when the quality of the text is low, there are spelling errors in the
pattern or text, finding DNA subsequences after mutation, heterogeneous databases, etc. Some
approximate string matching algorithms are:
 Naive Approach: It slides the pattern over text one by one and check for approximate
matches. If they are found, then slides by 1 again to check for subsequent approximate
matches.
 Sellers Algorithm (Dynamic Programming)
 Shift or Algorithm (Bitmap Algorithm)
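The naive approximate approach above can be sketched for the Hamming-distance case: slide the pattern one position at a time and report every window that differs from the pattern in at most k characters (the function name is illustrative):

```python
def approx_matches(text, pattern, k):
    """Start indices of windows within Hamming distance k of pattern."""
    m = len(pattern)
    return [s for s in range(len(text) - m + 1)
            if sum(a != b for a, b in zip(text[s:s + m], pattern)) <= k]

print(approx_matches("abcdefg", "bXd", 1))  # [1]: "bcd" differs in 1 place
```

Hamming distance only counts substitutions; handling insertions and deletions as well (Levenshtein distance) requires the dynamic-programming approach of the Sellers algorithm listed above.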
