Here are the lecture notes for UC Berkeley CS188 Lecture 3.
Heuristics
Heuristics are estimates of distance to goal states.
Heuristics are typically solutions to relaxed problems (where some of the constraints of the original problem have been removed).
With heuristics, it becomes very easy to implement logic in our agent that enables them to “prefer” expanding states that are estimated to be closer to goal states when deciding which action to perform.
Greedy Search
Greedy Search always selects the frontier node with the lowest heuristic value for expansion.
The frontier is represented by a priority queue.
Greedy search is not guaranteed to find a goal state if one exists, nor is it optimal. (If a very bad heuristic function is selected)
A* Search
A* search is a strategy for exploration that always selects the frontier node with the lowest estimated total cost for expansion.
A* search also uses a priority queue to represent its frontier.
A* search is both complete and optimal, if the heuristic is admissible.
- : The function representing total backwards cost computed by UCS.
- : The heuristic value function, or estimated forward cost, used by greedy search.
- : The function representing estimated total cost, used by A* search.
Admissibility
The condition required for optimality when using A* tree search is known as admissibility.
Defining as the true optimal forward cost to reach a goal state from a given node , we can formulate the admissibility constraint mathematically as follows:
Theorem. For a given search problem, if the admissibility constraint is satisfied by a heuristic function h, using A* tree search with h on that search problem will yield an optimal solution.
We can prove this theorem using proof by contradiction.
Graph Search
We keep track of which states you’ve already expanded, and never expand them again.
Tree search with this added optimization is known as graph search.
1 | function A*-GRAPH-SEARCH(problem, frontier) return a solution or failure |
Note that in implementation, it’s critically important to store the reached set as a disjoint set and not a list.
Dominance
If heuristic is dominant over heuristic , then the estimated goal distance for is greater than the estimated goal distance for for every node in the state space graph. Mathematically,
Dominance very intuitively captures the idea of one heuristic being better than another - if one admissible heuristic is dominant over another, it must be better because it will always more closely estimate the distance to a goal from any given state.
Additionally, the trivial heuristic is defined as , and using it reduces A* search to UCS.
All admissible heuristics dominate the trivial heuristic.
As a general rule, the max function applied to multiple admissible heuristics will also always be admissible.
Consistency
A heuristic function is said to be consistent, or monotone, if its estimate is always less than or equal to the estimated distance from any neighbouring vertex to the goal, plus the cost of reaching that neighbour.
Mathematically, this is expressed as the triangle inequality:
where is the actual cost of the edge from to .
If is consistent, the first time A* reaches a node, it is guaranteed to be via the optimal path.