Informed search problems build upon what we discussed in the last section. Such problems assume that we are provided with some additional information about the problem which can be used to inform the search. Using this additional information, referred to as the heuristic, we can guide the search to find the best solution faster than if we were uninformed. Whilst this heuristic does not have to be perfect, it needs to express some concept of how good a solution is, and hence which path is more promising.
Why use informed search?
In this lesson, we will look at four main types: greedy best-first search, A* search, hill climbing, and simulated annealing.
More specifically, a heuristic is a mathematical function that approximates how close a given state is to the goal.
To engineer an effective heuristic, it is important to consider the problem at hand and ensure that it is an effective measure of "distance" from the goal state.
However, for a heuristic to guarantee optimality (in other words, to ensure that the algorithm will always find the best possible solution), it must satisfy two properties.
The first condition is admissibility: the heuristic must never overestimate the cost to reach the goal (conceptually, this means that the heuristic is "optimistic" and never exaggerates the true distance to the goal).
The second condition is consistency: the heuristic must be monotonic, meaning that for any node n and successor n', h(n) ≤ c(n, n') + h(n').
In other words, the heuristic estimate can never drop by more than the cost of the step just taken, so the heuristic never promises a "shortcut" that the actual edge costs do not support: the heuristic function stays consistent with the problem's true measure of distance.
Admissibility and consistency guarantee optimality in informed search algorithms such as A*, allowing us to effectively search a graph by calculating an estimated cost at each frontier node and expanding the node with the lowest value.
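As a concrete illustration, the sketch below uses Manhattan distance on a 4-connected grid (a classic example, not something specific to this lesson): with unit step costs it is both admissible and consistent, since a single move can shrink the distance by at most the cost of that move.

```python
# Manhattan distance as a heuristic on a 4-connected grid with unit
# step costs. It is admissible (never overestimates the true path
# length) and consistent (one step of cost 1 changes it by at most 1).

def manhattan(state, goal):
    """Heuristic h(n): grid distance to the goal, ignoring obstacles."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])

# Consistency check for one step: h(n) <= c(n, n') + h(n').
n, n_prime, goal = (0, 0), (0, 1), (3, 4)
step_cost = 1
assert manhattan(n, goal) <= step_cost + manhattan(n_prime, goal)
```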
Greedy best-first search expands the node with the lowest heuristic value. An algorithm is called "greedy" if it always chooses the most promising option at each step, even if it is not the best option overall. As a result, greedy algorithms are often faster, but they are generally not guaranteed to find the best solution.
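A minimal sketch of this idea (the adjacency-list graph and heuristic table below are illustrative assumptions, not a fixed API): the frontier is a priority queue ordered by h(n) alone, so the node that looks closest to the goal is always expanded first.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Expand nodes in order of heuristic value h(n) only."""
    frontier = [(h[start], start, [start])]  # (h-value, node, path)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(frontier, (h[neighbour], neighbour, path + [neighbour]))
    return None  # goal unreachable
```

Note that the path cost so far is ignored entirely, which is exactly why greedy best-first search can return a suboptimal route.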
The A* search algorithm improves on the greedy approach by also incorporating the work already required to get to the current node.
A* chooses the next node based on f(n) = g(n) + h(n), where g(n) is the cost of the path from the start to node n, and h(n) is the heuristic estimate of the cost from n to the goal. A* expands the node with the lowest f(n) value. With an admissible heuristic, A* is optimal and complete.
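The same frontier idea can be sketched for A* (again, the weighted adjacency-list format and heuristic table are illustrative assumptions): the only change from the greedy version is that the priority is f(n) = g(n) + h(n) rather than h(n) alone.

```python
import heapq

def a_star(graph, h, start, goal):
    """Expand nodes in order of f(n) = g(n) + h(n).

    graph maps node -> list of (neighbour, edge_cost) pairs.
    """
    frontier = [(h[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}  # cheapest known cost to each node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for neighbour, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(neighbour, float('inf')):
                best_g[neighbour] = new_g
                heapq.heappush(frontier,
                               (new_g + h[neighbour], new_g, neighbour,
                                path + [neighbour]))
    return None, float('inf')  # goal unreachable
```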
As its name suggests, hill climbing is a search algorithm that iteratively moves to a better neighbouring state until no improvement is possible. It is a local search algorithm, meaning that it is only able to find local optima (if the initial position of the hill climber is far from the global optimum, it will likely get stuck in a local optimum, as it has no ability to escape). Unlike the previous two algorithms, hill climbing is often used on continuous problems, such as finding the maximum value of a continuous function.
The main problems with hill climbing are:
- Local optima: the climber stops at the first peak it reaches, which may not be the global optimum.
- Plateaus: flat regions of the search space offer no uphill direction to follow.
- Ridges: narrow peaks can force a zigzag of tiny steps, or block progress entirely if no single move goes uphill.
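A minimal sketch of hill climbing on a continuous 1-D function (the objective, step size, and iteration count below are illustrative assumptions): each iteration samples a random neighbour and moves only if it improves the objective, which is exactly why the climber can stall on a local maximum.

```python
import random

def hill_climb(f, x, step=0.1, iterations=1000, seed=0):
    """Greedy local search: accept a random neighbour only if it is better."""
    rng = random.Random(seed)
    for _ in range(iterations):
        candidate = x + rng.uniform(-step, step)
        if f(candidate) > f(x):  # move only uphill
            x = candidate
    return x

# Maximise f(x) = -(x - 2)^2, whose single (global) maximum is at x = 2.
best = hill_climb(lambda x: -(x - 2) ** 2, x=0.0)
```

On this single-peaked function the climber converges; on a multi-peaked one it would stop at whichever peak is nearest the starting point.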
Simulated annealing is named after the process of annealing metal in a furnace, where a slow decrease in temperature reduces the internal stresses within the material, so that it can be worked without fracturing. In a similar way, simulated annealing is the hill climbing algorithm with the introduction of a "temperature" variable. The algorithm starts at a high temperature, which gradually decreases. At high temperature, the algorithm is likely to accept random moves, even ones that make the solution worse; at low temperature, it behaves more like plain hill climbing, accepting mostly improving moves. This creates a balance between exploration and exploitation, giving the algorithm a chance to escape local optima and increasing the likelihood of finding the global optimum.
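The temperature schedule can be sketched as follows (the cooling rate, step size, and bumpy objective are illustrative assumptions): worse moves are accepted with probability exp(Δ/T), so acceptance of downhill moves fades as T cools.

```python
import math
import random

def simulated_annealing(f, x, temp=10.0, cooling=0.995, steps=5000, seed=0):
    """Hill climbing with a temperature that controls exploration."""
    rng = random.Random(seed)
    best = x
    for _ in range(steps):
        candidate = x + rng.uniform(-1, 1)
        delta = f(candidate) - f(x)
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta > 0 or rng.random() < math.exp(delta / temp):
            x = candidate
        if f(x) > f(best):
            best = x
        temp *= cooling  # gradual cool-down: exploration -> exploitation
    return best

def bumpy(x):
    """Objective with many local maxima; global maximum at x = 0."""
    return -x * x + 2 * math.cos(3 * x)

best = simulated_annealing(bumpy, x=4.0)
```

Started at x = 4.0, plain hill climbing would likely stall on a nearby bump; the annealed version has a chance to cross the valleys while the temperature is still high.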
Informed search is a smarter approach to traversing a search space because it uses extra information to focus on promising paths.
With the right heuristic or strategy, you can often solve problems faster, with less memory,
and sometimes even optimally.
One subset of informed search algorithms is that of "genetic algorithms".
These algorithms use heuristics to exploit evolutionary theory to find the best solution to an incredibly wide range of problems (sometimes very complicated ones).
Genetic algorithms are so interesting, in fact, that I have dedicated an entire later section to them, which is why they have not been included in this discussion.