Informed search problems build upon what we discussed in the last section. Such problems assume that we are provided with some additional information about the problem which can be used to inform the search. Using this additional information, referred to as the heuristic, we can guide the search to find the best solution faster than if we were uninformed. Whilst this heuristic does not have to be perfect, it needs to express some concept of how good a solution is, and hence which path is more promising.
Why use informed search?
In this lesson, we will look at four main types: greedy best-first search, A* search, hill climbing, and simulated annealing.
More specifically, a heuristic is a mathematical function that approximates how close a given state is to the goal.
To engineer an effective heuristic, it is important to consider the problem at hand and ensure that it is an effective measure of "distance" from the goal state.
However, for a heuristic to guarantee optimality (in other words, to ensure that the algorithm will always find the best possible solution), it must satisfy two properties.
The first condition is admissibility: the heuristic must never overestimate the cost to reach the goal (conceptually, this means that the heuristic is "optimistic" and never exaggerates the true distance to the goal).
The second condition is consistency: the heuristic must be monotonic, meaning that for any node n and successor n', h(n) ≤ c(n, n') + h(n').
In other words, the heuristic estimate can never drop by more than the cost of the step just taken, so the heuristic never promises a "shortcut" that the actual edge costs do not support: the heuristic function stays consistent with the problem's true measure of distance.
Admissibility and consistency guarantee optimality in informed search algorithms such as A*, allowing us to effectively search a graph by calculating an estimated cost at each frontier node and expanding the node with the lowest value.
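As a concrete illustration, the sketch below uses Manhattan distance on a 4-connected grid (a classic example, not something specific to this lesson): with unit step costs it is both admissible and consistent, since a single move can shrink the distance by at most the cost of that move.

```python
# Manhattan distance as a heuristic on a 4-connected grid with unit
# step costs. It is admissible (never overestimates the true path
# length) and consistent (one step of cost 1 changes it by at most 1).

def manhattan(state, goal):
    """Heuristic h(n): grid distance to the goal, ignoring obstacles."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])

# Consistency check for one step: h(n) <= c(n, n') + h(n').
n, n_prime, goal = (0, 0), (0, 1), (3, 4)
step_cost = 1
assert manhattan(n, goal) <= step_cost + manhattan(n_prime, goal)
```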
Greedy best-first search expands the node with the lowest heuristic value. An algorithm is called "greedy" if it always chooses the most promising option at each step, even if it is not the best option overall. As a result, greedy algorithms are often faster, but they are generally not guaranteed to find the best solution.
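A minimal sketch of this idea (the adjacency-list graph and heuristic table below are illustrative assumptions, not a fixed API): the frontier is a priority queue ordered by h(n) alone, so the node that looks closest to the goal is always expanded first.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Expand nodes in order of heuristic value h(n) only."""
    frontier = [(h[start], start, [start])]  # (h-value, node, path)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(frontier, (h[neighbour], neighbour, path + [neighbour]))
    return None  # goal unreachable
```

Note that the path cost so far is ignored entirely, which is exactly why greedy best-first search can return a suboptimal route.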
The A* search algorithm improves on the greedy approach by also incorporating the work already required to get to the current node.
A* chooses the next node based on f(n) = g(n) + h(n), where g(n) is the cost of the path from the start to node n, and h(n) is the heuristic estimate of the cost from n to the goal. A* expands the node with the lowest f(n) value. With an admissible heuristic, A* is optimal and complete.
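The same frontier idea can be sketched for A* (again, the weighted adjacency-list format and heuristic table are illustrative assumptions): the only change from the greedy version is that the priority is f(n) = g(n) + h(n) rather than h(n) alone.

```python
import heapq

def a_star(graph, h, start, goal):
    """Expand nodes in order of f(n) = g(n) + h(n).

    graph maps node -> list of (neighbour, edge_cost) pairs.
    """
    frontier = [(h[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}  # cheapest known cost to each node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for neighbour, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(neighbour, float('inf')):
                best_g[neighbour] = new_g
                heapq.heappush(frontier,
                               (new_g + h[neighbour], new_g, neighbour,
                                path + [neighbour]))
    return None, float('inf')  # goal unreachable
```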
As its name suggests, hill climbing is a search algorithm that iteratively moves to a better neighbouring state until no improvement is possible. It is a local search algorithm, meaning that it is only able to find local optima (if the initial position of the hill climber is far from the global optimum, it will likely get stuck in a local optimum, as it has no ability to escape). Unlike the previous two algorithms, hill climbing is often used on continuous problems, such as finding the maximum value of a continuous function.
The main problems with hill climbing are:
- Local optima: the climber stops at the first peak it reaches, which may not be the global optimum.
- Plateaus: flat regions of the search space offer no uphill direction to follow.
- Ridges: narrow peaks can force a zigzag of tiny steps, or block progress entirely if no single move goes uphill.
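A minimal sketch of hill climbing on a continuous 1-D function (the objective, step size, and iteration count below are illustrative assumptions): each iteration samples a random neighbour and moves only if it improves the objective, which is exactly why the climber can stall on a local maximum.

```python
import random

def hill_climb(f, x, step=0.1, iterations=1000, seed=0):
    """Greedy local search: accept a random neighbour only if it is better."""
    rng = random.Random(seed)
    for _ in range(iterations):
        candidate = x + rng.uniform(-step, step)
        if f(candidate) > f(x):  # move only uphill
            x = candidate
    return x

# Maximise f(x) = -(x - 2)^2, whose single (global) maximum is at x = 2.
best = hill_climb(lambda x: -(x - 2) ** 2, x=0.0)
```

On this single-peaked function the climber converges; on a multi-peaked one it would stop at whichever peak is nearest the starting point.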
Simulated annealing is named after the process of annealing metal in a furnace, where a slow decrease in temperature reduces the internal stresses within the material, so that it can be worked without fracturing. In a similar way, simulated annealing is the hill climbing algorithm with the introduction of a "temperature" variable. The algorithm starts at a high temperature, which gradually decreases. At high temperature, the algorithm is likely to accept random moves, even ones that make the solution worse; at low temperature, it behaves more like plain hill climbing, accepting mostly improving moves. This creates a balance between exploration and exploitation, giving the algorithm a chance to escape local optima and increasing the likelihood of finding the global optimum.
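The temperature schedule can be sketched as follows (the cooling rate, step size, and bumpy objective are illustrative assumptions): worse moves are accepted with probability exp(Δ/T), so acceptance of downhill moves fades as T cools.

```python
import math
import random

def simulated_annealing(f, x, temp=10.0, cooling=0.995, steps=5000, seed=0):
    """Hill climbing with a temperature that controls exploration."""
    rng = random.Random(seed)
    best = x
    for _ in range(steps):
        candidate = x + rng.uniform(-1, 1)
        delta = f(candidate) - f(x)
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta > 0 or rng.random() < math.exp(delta / temp):
            x = candidate
        if f(x) > f(best):
            best = x
        temp *= cooling  # gradual cool-down: exploration -> exploitation
    return best

def bumpy(x):
    """Objective with many local maxima; global maximum at x = 0."""
    return -x * x + 2 * math.cos(3 * x)

best = simulated_annealing(bumpy, x=4.0)
```

Started at x = 4.0, plain hill climbing would likely stall on a nearby bump; the annealed version has a chance to cross the valleys while the temperature is still high.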
Informed search is a smarter approach to traversing a search space because it uses extra information to focus on promising paths.
With the right heuristic or strategy, you can often solve problems faster, with less memory,
and sometimes even optimally.
One subset of informed search algorithms is that of "genetic algorithms".
These algorithms use heuristics to exploit evolutionary theory to find the best solution to an incredibly wide range of problems (sometimes very complicated ones).
Genetic algorithms are so interesting, in fact, that I have dedicated an entire later section to them, which is why they have not been included in this discussion.