Informed Search
• An informed search strategy uses problem-specific knowledge beyond the definition of the problem itself and can find solutions more efficiently than an uninformed strategy.
• The general approach of informed search is called Best-First Search.
• Best-first search is an instance of the general TREE-SEARCH or GRAPH-SEARCH algorithm in which a node is selected for expansion based on an evaluation function, f(n).
• The evaluation function is construed as a cost estimate, so the node with the lowest evaluation is expanded first.
Heuristic Function

• The choice of evaluation function f determines the search strategy.
• Best-first algorithms include as a component of f a heuristic function, denoted h(n).
• h(n) = estimated cost of the cheapest path from the state at node n to a goal state; if n is a goal node, then h(n) = 0.
• Heuristic functions are the most common form in which additional knowledge of the problem is imparted to the search algorithm.
Evaluation
• Evaluation function f(n) = g(n) + h(n)
– g(n) = exact cost so far to reach n
– h(n) = estimated cost to goal from n
– f(n) = estimated total cost of cheapest path
through n to goal
• At goal state h(n) = 0
• Special cases:
– Uniform Cost Search: f(n) = g(n)
– Greedy (best-first) Search: f(n) = h(n)
– A* Search: f(n) = g(n) + h(n)
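The three special cases above differ only in the choice of f. As a rough sketch (not from the slides; the tiny graph and its heuristic values are invented for illustration), a single priority-queue loop can implement all of them:

```python
import heapq

def best_first_search(start, goal, successors, f):
    """Generic best-first search: always expand the frontier node with the
    lowest value of the evaluation function f(state, g)."""
    frontier = [(f(start, 0), 0, start, [start])]  # (f, g, state, path)
    best_g = {}
    while frontier:
        _, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        if state in best_g and best_g[state] <= g:
            continue  # already reached this state at least as cheaply
        best_g[state] = g
        for nxt, cost in successors(state):
            g2 = g + cost
            heapq.heappush(frontier, (f(nxt, g2), g2, nxt, path + [nxt]))
    return None, float("inf")

# Hypothetical 3-node graph and heuristic, just to exercise the code.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1)], "C": []}
h = {"A": 2, "B": 1, "C": 0}

ucs    = lambda n, g: g          # Uniform Cost Search: f(n) = g(n)
greedy = lambda n, g: h[n]       # Greedy best-first:   f(n) = h(n)
astar  = lambda n, g: g + h[n]   # A* Search:           f(n) = g(n) + h(n)

path, cost = best_first_search("A", "C", lambda s: graph[s], astar)
```

Even on this toy graph the special cases behave differently: A* and uniform-cost return the cheap path A–B–C (cost 2), while greedy jumps straight to C for cost 4.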
8 Puzzle Problem

24 January 2024
Greedy best-first search
• Greedy best-first search tries to expand the node that is
closest to the goal, on the grounds that this is likely to lead to a
solution quickly.
• It evaluates nodes by using just the heuristic function; that is,
f(n) = h(n).
• For route-finding problems in Romania, we use the straight-line distance as the heuristic, denoted hSLD.
• If the goal is Bucharest, we need to know the straight-line distance to Bucharest from every node n.
• For example, hSLD(Arad) = 366.
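As a concrete sketch, here is greedy best-first search on a fragment of the Romania map (road costs and hSLD values as given in Russell and Norvig; only the cities needed for this route are included):

```python
import heapq

# Fragment of the Romania road map and straight-line distances to Bucharest.
roads = {
    'Arad': [('Sibiu', 140), ('Timisoara', 118), ('Zerind', 75)],
    'Sibiu': [('Arad', 140), ('Fagaras', 99), ('Rimnicu Vilcea', 80), ('Oradea', 151)],
    'Fagaras': [('Sibiu', 99), ('Bucharest', 211)],
    'Rimnicu Vilcea': [('Sibiu', 80), ('Pitesti', 97)],
    'Pitesti': [('Rimnicu Vilcea', 97), ('Bucharest', 101)],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
         'Oradea': 380, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Bucharest': 0}

def greedy_best_first(start, goal):
    """Expand the frontier node with the lowest h value; f(n) = h(n)."""
    frontier = [(h_sld[start], start, [start])]
    visited = set()           # repeated-state check to avoid loops
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        if state in visited:
            continue
        visited.add(state)
        for nxt, _cost in roads.get(state, []):
            if nxt not in visited:
                heapq.heappush(frontier, (h_sld[nxt], nxt, path + [nxt]))
    return None

route = greedy_best_first('Arad', 'Bucharest')
```

The search follows decreasing hSLD values (366, 253, 176, 0) and returns Arad, Sibiu, Fagaras, Bucharest: exactly the path the slides describe as 32 km longer than the optimal one.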
Greedy best-first search

• The values of hSLD cannot be computed from the problem description itself.
• Moreover, it takes a certain amount of experience to know that hSLD is correlated with actual road distances and is, therefore, a useful heuristic.
• For this particular problem, greedy best-first search using hSLD finds a solution without ever expanding a node that is not on the solution path; hence, its search cost is minimal.
• However, the path via Sibiu and Fagaras to Bucharest is 32 kilometers longer than the path through Rimnicu Vilcea and Pitesti.
• This shows why the algorithm is called "greedy": at each step it tries to get as close to the goal as it can.
• Greedy best-first tree search is also incomplete even in a finite state space, much like depth-first search.
• Consider the problem of getting from Iasi to Fagaras.
• The heuristic suggests that Neamt be expanded first because it is closest to Fagaras, but it is a dead end.
• The solution is to go first to Vaslui (farther from the goal according to the heuristic) and then to continue to Urziceni, Bucharest, and Fagaras.
Properties of Greedy Best-First Search
• Complete? No – it can get stuck in loops, e.g., with Oradea as goal, starting from Iasi:
– Iasi 🡪 Neamt 🡪 Iasi 🡪 Neamt 🡪 …
It is complete in a finite space with repeated-state checking.
• Time? O(b^m), where m is the maximum depth of the search space.
• Space? O(b^m) – it keeps all nodes in memory.
• Optimal? No.

• With a good heuristic function, however, the complexity can be reduced substantially.
• The amount of the reduction depends on the particular problem
and on the quality of the heuristic.
A* Search

• The most widely known form of best-first search is called A* search.
• A* search evaluates nodes by combining g(n), the cost to reach
the node, and h(n), the cost to get from the node to the goal:
f(n) = g(n) + h(n)
• g(n) gives the path cost from the start node to node n, and h(n)
is the estimated cost of the cheapest path from n to the goal, so
f(n) = estimated cost of the cheapest solution through n.

• The algorithm is identical to UNIFORM-COST-SEARCH except that A* uses g + h instead of g.
• Evaluation function f(n) = g(n) + h(n)
– g(n) = exact cost so far to reach n
– h(n) = estimated cost to goal from n
– f(n) = estimated total cost of cheapest path through
n to goal
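The evaluation above can be sketched directly in code. This is a minimal A* on the same Romania fragment used in the book (road costs and hSLD values from Russell and Norvig), ordering the frontier by f = g + h:

```python
import heapq

roads = {
    'Arad': [('Sibiu', 140), ('Timisoara', 118), ('Zerind', 75)],
    'Sibiu': [('Arad', 140), ('Fagaras', 99), ('Rimnicu Vilcea', 80), ('Oradea', 151)],
    'Fagaras': [('Sibiu', 99), ('Bucharest', 211)],
    'Rimnicu Vilcea': [('Sibiu', 80), ('Pitesti', 97)],
    'Pitesti': [('Rimnicu Vilcea', 97), ('Bucharest', 101)],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
         'Oradea': 380, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Bucharest': 0}

def a_star(start, goal):
    """Expand the frontier node with the lowest f = g + h."""
    frontier = [(h_sld[start], 0, start, [start])]  # (f, g, state, path)
    best_g = {}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        if state in best_g and best_g[state] <= g:
            continue  # a cheaper path to this state was already expanded
        best_g[state] = g
        for nxt, cost in roads.get(state, []):
            g2 = g + cost
            heapq.heappush(frontier, (g2 + h_sld[nxt], g2, nxt, path + [nxt]))
    return None, float('inf')

path, cost = a_star('Arad', 'Bucharest')
```

Unlike greedy search, A* finds the cheaper route through Rimnicu Vilcea and Pitesti (140 + 80 + 97 + 101 = 418), because Bucharest via Fagaras sits on the frontier with f = 450 while Pitesti has f = 417.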
Conditions for Optimality: Admissibility and
Consistency
• The first condition for optimality is that h(n) be an admissible
heuristic.
• An admissible heuristic is one that never overestimates the cost to
reach the goal.
• Because g(n) is the actual cost to reach n along the current path, and
f(n) = g(n) + h(n), we have as an immediate consequence that f(n)
never overestimates the true cost of a solution along the current path
through n.
• Admissible heuristics are by nature optimistic because they think
the cost of solving the problem is less than it actually is.
• An obvious example of an admissible heuristic is the straight-line
distance hSLD that we used in getting to Bucharest.
• Straight-line distance is admissible because the shortest path
between any two points is a straight line, so the straight line
cannot be an overestimate.

• Bucharest first appears on the frontier at step (e), but it is not selected for expansion because its f-cost (450) is higher than that of Pitesti (417).
• Another way to say this is that there might be a solution through Pitesti whose cost is as low as 417, so the algorithm will not settle for a solution that costs 450.
• Suppose some suboptimal goal G2 has been generated and is in the fringe.
Let n be an unexpanded node in the fringe such that n is on a shortest path
to an optimal goal G.

• f(G2) = g(G2), since h(G2) = 0
• g(G2) > g(G), since G2 is suboptimal
• f(G) = g(G), since h(G) = 0
• Therefore f(G2) > f(G)
❑ Suppose some suboptimal goal G2 has been generated and is in the fringe. Let n
be an unexpanded node in the fringe such that n is on a shortest path to an
optimal goal G.

❑ f(G2) > f(G), as shown above
❑ h(n) ≤ h*(n), since h is admissible
❑ g(n) + h(n) ≤ g(n) + h*(n) = g(G) = f(G), since n lies on an optimal path to G
❑ Therefore f(n) ≤ f(G)

Hence f(G2) > f(n), and A* will never select G2 for expansion.
For h to be admissible, the inequality h(n) ≤ h*(n) must hold at every node n; if it fails at any node, the heuristic is not admissible.
Maintain a closed list containing those nodes that have already been expanded. Then, if a node is encountered that is already in the closed list, it is simply ignored.
This guarantees that no loops are generated, and essentially converts the graph into a tree.
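The closed-list idea can be sketched in a few lines, on a toy graph invented for illustration. The heuristic below is admissible (it never overestimates) but inconsistent, which is exactly the situation where a naive closed list goes wrong; the reopen flag switches between ignoring a rediscovered node and re-opening it when a cheaper path is found:

```python
import heapq

# Toy graph (hypothetical). The heuristic is admissible but inconsistent:
# h(A) = 5 > c(A, B) + h(B) = 1 + 0.
edges = {'S': [('A', 1), ('B', 5)], 'A': [('B', 1)], 'B': [('G', 5)], 'G': []}
h = {'S': 0, 'A': 5, 'B': 0, 'G': 0}

def graph_astar(start, goal, reopen):
    """A* graph search with a closed list. With reopen=False, a node already
    in the closed list is simply ignored, even if rediscovered more cheaply."""
    frontier = [(h[start], 0, start)]  # (f, g, state)
    closed = {}
    while frontier:
        f, g, state = heapq.heappop(frontier)
        if state == goal:
            return g
        if state in closed and (not reopen or closed[state] <= g):
            continue
        closed[state] = g
        for nxt, cost in edges[state]:
            heapq.heappush(frontier, (g + cost + h[nxt], g + cost, nxt))
    return None

naive = graph_astar('S', 'G', reopen=False)  # expands B via the dear path first
fixed = graph_astar('S', 'G', reopen=True)   # re-opens B when S-A-B is found
```

Here the naive closed list returns a cost of 10 (S, B, G), while re-opening recovers the optimal cost 7 (S, A, B, G): the first path discovered to a node is not necessarily the optimal one.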
• It is all very well to prune the search space by ignoring repeated states
• But the problem is that graph search can end up discovering suboptimal solutions
– Basically, a loop means that there might be more than one path to a node
– Once we discover a path to a node, any other paths to the same node are ignored by graph search
– However, it is not necessarily the case that the first path found is the optimal one
– Hence, suboptimal solutions can be returned.
• Solution?
– We can adopt a uniform-cost approach, in which
we keep track of all the paths that have been
currently generated
– Then, we select only that path which has the
least-cost.

• A heuristic is consistent if, for every node n and every successor n' of n generated by any action a,
h(n) ≤ c(n,a,n') + h(n')

• If h is consistent,
f(n') = g(n') + h(n')
= g(n) + c(n,a,n') + h(n')
≥ g(n) + h(n)
≥ f(n)

• i.e., f(n) is non-decreasing along any path.


• It is fairly easy to show that every consistent heuristic is also
admissible.
• Consistency is therefore a stricter requirement than admissibility.
• Consider, for example, hSLD: the general triangle inequality is satisfied when each side is measured by the straight-line distance, and the straight-line distance between n and n' is no greater than c(n, a, n').
• Hence, hSLD is a consistent heuristic.
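The consistency condition is easy to check mechanically. This sketch (helper name is ours) tests h(n) ≤ c(n,a,n') + h(n') on every edge of a Romania fragment, treating roads as two-way:

```python
def is_consistent(edges, h):
    """Check h(n) <= c(n, a, n') + h(n') for every edge, in both directions
    (roads are two-way, so each listed edge is checked both ways)."""
    for n, neighbours in edges.items():
        for n2, cost in neighbours:
            if h[n] > cost + h[n2] or h[n2] > cost + h[n]:
                return False
    return True

# Fragment of the Romania map with straight-line distances to Bucharest.
roads = {
    'Arad': [('Sibiu', 140), ('Timisoara', 118), ('Zerind', 75)],
    'Sibiu': [('Fagaras', 99), ('Rimnicu Vilcea', 80), ('Oradea', 151)],
    'Fagaras': [('Bucharest', 211)],
    'Rimnicu Vilcea': [('Pitesti', 97)],
    'Pitesti': [('Bucharest', 101)],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
         'Oradea': 380, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Bucharest': 0}

ok = is_consistent(roads, h_sld)          # hSLD satisfies the triangle inequality
bad = is_consistent({'p': [('c1', 1)]},   # hypothetical: h(p) = 5 > 1 + 2
                    {'p': 5, 'c1': 2})
```

For this fragment the check passes everywhere (e.g., h(Sibiu) = 253 ≤ 99 + 176 = 275 on the Sibiu–Fagaras edge), while the made-up second graph fails it.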
• A* properties: the tree-search version of A* is optimal if h(n) is
admissible, while the graph-search version is optimal if h(n) is
consistent.
• If h(n) is consistent, then the values of f(n) along any path are
non-decreasing.
• A* expands nodes in order of increasing f value
• Gradually adds "f-contours" of nodes
• Contour i contains all nodes with f = f_i, where f_i < f_{i+1}
• The fact that f-costs are nondecreasing along any path also means
that we can draw contours in the state space.

• Inside the contour labeled 400, all nodes have f(n) less than or equal to 400.

• Because A* expands the frontier node of lowest f-cost, we can see that an A* search fans out from the start node, adding nodes in concentric bands of increasing f-cost.

• With uniform-cost search (A* search using h(n) = 0), the bands will
be “circular” around the start state.

• With more accurate heuristics, the bands will stretch toward the
goal state and become more narrowly focused around the optimal
path.
Any consistent heuristic is also admissible. But when is a
heuristic admissible but not consistent (monotone)?

1. The cost estimate of getting to the goal through the parent node p is at least 10 (because the cost of the path to p is 5 and the heuristic estimate at p is also 5).
2. The cost estimate for getting to the goal through c1, however, is just 8.

Since this graph is undirected, the heuristic is also inconsistent at c2, because going from c2 to p has the same problem.
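The arithmetic in this example can be checked directly. The edge cost c(p, c1) = 1 and h(c1) = 2 are assumptions chosen to match the totals quoted above (path cost to p is 5, h(p) = 5, estimate through c1 is 8):

```python
g_p, h_p = 5, 5        # cost of the path to p, and its heuristic estimate
c_p_c1, h_c1 = 1, 2    # assumed edge cost and child heuristic (hypothetical)

f_p = g_p + h_p               # estimate through p:  10
f_c1 = g_p + c_p_c1 + h_c1    # estimate through c1:  8  (f decreased!)

# The drop in f is exactly a violation of h(n) <= c(n,a,n') + h(n').
inconsistent_at_p = h_p > c_p_c1 + h_c1
```

Along a path with a consistent heuristic, f never decreases; here f falls from 10 to 8, so the heuristic is not consistent even though each estimate on its own may still be admissible.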
The classic heuristic for this problem (Manhattan distance of each tile to
the location where it is supposed to be) is admissible and consistent.
Checking Monotonicity

If the inequality h(n) ≤ c(n,a,n') + h(n') fails for any edge, the heuristic is not monotonic.
• If C* is the cost of the optimal solution path, then we can say
the following:
• A* expands all nodes with f(n) < C*.

• A* might then expand some of the nodes right on the "goal contour" (where f(n) = C*) before selecting a goal node.
• Completeness requires that there be only finitely many nodes with cost less than or equal to C*, a condition that is true if all step costs exceed some finite ε and if b is finite.
• Notice that A* expands no nodes with f(n) > C*: for example, Timisoara (118 + 329 = 447) is not expanded in the figure even though it is a child of the root.
Pruning
• We say that the subtree below Timisoara is pruned; because
hSLD is admissible, the algorithm can safely ignore this subtree
while still guaranteeing optimality.

• The concept of pruning (eliminating possibilities from consideration without having to examine them) is important for many areas of AI.
• One final observation is that among optimal algorithms of this type (algorithms that extend search paths from the root and use the same heuristic information), A* is optimally efficient for any given consistent heuristic: no other optimal algorithm is guaranteed to expand fewer nodes than A*.
Complexity

• The complexity results depend very strongly on the assumptions made about the state space.
• The simplest model studied is a state space that has a single goal and is essentially a tree with reversible actions.
• The complexity of A* often makes it impractical to insist on finding an optimal solution.
• The use of a good heuristic still provides enormous savings compared to the use of an uninformed search.
• The A* algorithm often runs out of space (because it keeps all generated nodes in memory) before it runs out of time.
Memory-bounded heuristic search

• The simplest way to reduce memory requirements for A* is to adapt the idea of iterative deepening to the heuristic search context, resulting in the iterative-deepening A* (IDA*) algorithm.
• The main difference between IDA* and standard iterative deepening is that the cutoff used is the f-cost (g + h) rather than the depth.
• At each iteration, the cutoff value is the smallest f-cost of any node that exceeded the cutoff on the previous iteration.

• IDA* is practical for many problems with unit step costs and avoids
the substantial overhead associated with keeping a sorted queue of
nodes.
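These two ideas, a depth-first search bounded by f and a cutoff raised to the smallest exceeded f-cost, can be sketched compactly. The sketch runs on a fragment of the Romania map (data from Russell and Norvig):

```python
import math

roads = {
    'Arad': [('Sibiu', 140), ('Timisoara', 118), ('Zerind', 75)],
    'Sibiu': [('Arad', 140), ('Fagaras', 99), ('Rimnicu Vilcea', 80), ('Oradea', 151)],
    'Fagaras': [('Sibiu', 99), ('Bucharest', 211)],
    'Rimnicu Vilcea': [('Sibiu', 80), ('Pitesti', 97)],
    'Pitesti': [('Rimnicu Vilcea', 97), ('Bucharest', 101)],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
         'Oradea': 380, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Bucharest': 0}

def ida_star(start, goal):
    """Iterative-deepening A*: depth-first contours bounded by f = g + h."""
    def search(state, g, bound, path):
        f = g + h_sld[state]
        if f > bound:
            return f, None          # report the f-cost that exceeded the bound
        if state == goal:
            return g, path
        smallest = math.inf
        for nxt, cost in roads.get(state, []):
            if nxt in path:
                continue            # avoid cycling on the current path
            t, found = search(nxt, g + cost, bound, path + [nxt])
            if found is not None:
                return t, found
            smallest = min(smallest, t)
        return smallest, None

    bound = h_sld[start]
    while bound < math.inf:
        t, found = search(start, 0, bound, [start])
        if found is not None:
            return found, t
        bound = t                   # smallest f that exceeded the old cutoff
    return None, math.inf
```

Because memory is only needed for the current path, space is linear in the solution depth; the price is re-expansion of nodes in each iteration as the bound grows (here 366, 393, 413, 415, 417, 418).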
Recursive best-first search (RBFS)
• The RBFS structure is similar to that of a recursive depth-first search, but rather than continuing indefinitely down the current path, it uses the f_limit variable to keep track of the f-value of the best alternative path available from any ancestor of the current node.
• If the current node exceeds this limit, the recursion unwinds back to
the alternative path.
• As the recursion unwinds, RBFS replaces the f-value of each node
along the path with a backed-up value—the best f-value of its
children.
• In this way, RBFS remembers the f-value of the best leaf in the
forgotten subtree and can therefore decide whether it’s worth
reexpanding the subtree at some later time.
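The unwinding and backing-up described above can be sketched as follows, again on a Romania fragment (data from Russell and Norvig); the recursive helper returns both a solution and the backed-up f-value:

```python
import math

roads = {
    'Arad': [('Sibiu', 140), ('Timisoara', 118), ('Zerind', 75)],
    'Sibiu': [('Arad', 140), ('Fagaras', 99), ('Rimnicu Vilcea', 80), ('Oradea', 151)],
    'Fagaras': [('Sibiu', 99), ('Bucharest', 211)],
    'Rimnicu Vilcea': [('Sibiu', 80), ('Pitesti', 97)],
    'Pitesti': [('Rimnicu Vilcea', 97), ('Bucharest', 101)],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
         'Oradea': 380, 'Fagaras': 176, 'Rimnicu Vilcea': 193,
         'Pitesti': 100, 'Bucharest': 0}

def rbfs(start, goal):
    """Recursive best-first search: follow the best child while its f-value
    stays within f_limit, the f-value of the best alternative ancestor path."""
    def search(state, path, g, f, f_limit):
        if state == goal:
            return path, f
        children = []
        for nxt, cost in roads.get(state, []):
            if nxt in path:
                continue
            g2 = g + cost
            # A child's f-value is at least its parent's backed-up value.
            children.append([max(g2 + h_sld[nxt], f), g2, nxt])
        if not children:
            return None, math.inf
        while True:
            children.sort()
            best = children[0]
            if best[0] > f_limit:
                return None, best[0]          # unwind; back up this f-value
            alternative = children[1][0] if len(children) > 1 else math.inf
            result, best[0] = search(best[2], path + [best[2]], best[1],
                                     best[0], min(f_limit, alternative))
            if result is not None:
                return result, best[0]

    return search(start, [start], 0, h_sld[start], math.inf)
```

On this map, RBFS first follows Rimnicu Vilcea, unwinds when Pitesti's f = 417 exceeds the 415 alternative through Fagaras, tries Fagaras, unwinds again at Bucharest's 450, and finally re-expands the Rimnicu Vilcea subtree to find the optimal 418 route.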
Heuristic search Problems

The average solution cost for a randomly generated 8-puzzle instance is about 22 steps, and the branching factor is about 3.
This means that an exhaustive tree search to depth 22 would look at about 3^22 ≈ 3.1×10^10 states.
A graph search would cut this down to 181,440 distinct states, which is manageable.
• h1 = the number of misplaced tiles
• All of the eight tiles are out of position, so the start state would
have h1= 8.
• h1 is an admissible heuristic because it is clear that any tile
that is out of place must be moved at least once.

• h2 = the sum of the distances of the tiles from their goal positions.
• Tiles 1 to 8 in the start state give a Manhattan distance of
h2 = 3 + 1 + 2 + 2 + 2 + 3 + 3 + 2 = 18.
As expected, neither of these overestimates the true solution cost,
which is 26.
• h2 = the sum of the distances of the tiles from their goal
positions.
• Because tiles cannot move along diagonals, the distance we
will count is the sum of the horizontal and vertical distances.
• This is sometimes called the city block distance or
Manhattan distance.
• h2 is also admissible because all any move can do is move
one tile one step closer to the goal.
• Tiles 1 to 8 in the start state give a Manhattan distance of
h2 = 3 + 1 + 2 + 2 + 2 + 3 + 3 + 2 = 18.
• As expected, neither of these overestimates the true solution
cost, which is 26.
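Both heuristics are a few lines of code. The sketch below uses the standard start and goal configurations from Russell and Norvig (0 denotes the blank, boards stored row by row):

```python
# AIMA's standard 8-puzzle instance, row by row; 0 is the blank.
start = (7, 2, 4, 5, 0, 6, 8, 3, 1)
goal  = (0, 1, 2, 3, 4, 5, 6, 7, 8)

def h1(state):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for i, t in enumerate(state) if t != 0 and t != goal[i])

def h2(state):
    """Sum of Manhattan (city block) distances of each tile from its goal
    square; rows and columns are recovered from the flat index."""
    total = 0
    for i, t in enumerate(state):
        if t == 0:
            continue
        j = goal.index(t)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total
```

On the start state above, h1 gives 8 (all eight tiles misplaced) and h2 gives 18, matching the slide's values; both stay at or below the true solution cost of 26.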
The effect of heuristic accuracy

• One way to characterize the quality of a heuristic is the effective branching factor b*.
• If the total number of nodes generated by A* for a particular problem is N and the solution depth is d, then b* is the branching factor that a uniform tree of depth d would have to have in order to contain N + 1 nodes:
N + 1 = 1 + b* + (b*)^2 + ... + (b*)^d
• For example, if A* finds a solution at depth 5 using 52 nodes,
then the effective branching factor is 1.92.
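The equation has no closed form for b*, but it can be solved numerically; this sketch (function name is ours) uses simple bisection, since the left-hand side grows monotonically in b*:

```python
def effective_branching_factor(n_generated, depth, tol=1e-9):
    """Solve N + 1 = 1 + b + b^2 + ... + b^depth for b by bisection."""
    target = n_generated + 1
    total = lambda b: sum(b ** i for i in range(depth + 1))
    lo, hi = 1.0, float(target)   # the root must lie in this interval
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

b_star = effective_branching_factor(52, 5)  # the slide's example: N = 52, d = 5
```

For N = 52 and d = 5 this yields approximately 1.92, the value quoted above.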
Summary

• A general TREE-SEARCH algorithm considers all possible paths to find a solution, whereas a GRAPH-SEARCH algorithm avoids consideration of redundant paths.
• Search algorithms are judged on the basis of completeness, optimality, time complexity, and space complexity.
• Informed search methods may have access to a heuristic function h(n) that estimates the cost of a solution from n.
• The generic best-first search algorithm selects a node for expansion according to an evaluation function.
• Greedy best-first search expands nodes with minimal h(n). It is not optimal but is often efficient.
• A* search expands nodes with minimal f(n) = g(n) + h(n). A* is complete and optimal, provided that h(n) is admissible (for TREE-SEARCH) or consistent (for GRAPH-SEARCH).
• RBFS (recursive best-first search) is a robust, optimal search algorithm that uses a limited amount of memory; given enough time, it can solve problems that A* cannot, because A* runs out of memory.
BASIS FOR COMPARISON   INFORMED SEARCH                        UNINFORMED SEARCH

Basic                  Uses knowledge to find the steps       Does not use knowledge.
                       to the solution.

Efficiency             Highly efficient: consumes less        Less efficient.
                       time and cost.

Cost                   Low.                                   Comparatively high.

Performance            Finds the solution more quickly.       Slower than informed search.

Algorithms             Best-first search and A* search.       Depth-first search, breadth-first
                                                              search, and lowest-cost-first search.
