The Basics of A* pathfinding – part 2

This post will discuss some finer A* points like how to get the path to the closest location to an unreachable destination and how to mark terrain as more or less preferable for walking (road vs. swamp).

Gracefully dealing with unreachable destinations

Part 1 of these series describes how the A* algorithms works. The example given there details its workings for a scenario where a path actually exists from the start to the end node (i.e. the destination node is reachable).

However, for a situation where the destination cannot be reached from our current start node, A* all by itself has 2 major issues:

– issue 1: it will investigate ALL reachable nodes (flood) before realizing that the destination node is unreachable
– issue 2: after doing all that, it won’t even come up with a path to the node closest to that unreachable destination. It will instead stop on some random node, which may or may not be closest to the destination node.

Now, in short, there’s no way to completely get rid of issue 1. There’s just some common sense things you can do to alleviate it to some extent. Issue 2 on the other hand has a great fix (which is already implemented in the examples, like the A* JavaScript Demo App ). Let’s discuss each of these.

To understand why A* investigates all reachable nodes when the destination is unreachable, we’ll need to take a look at how A* came to be and where its roots are. Here’s a brief, abridged and unofficial history of it: At some point, there was only Dijkstra’s algorithm. Not to get in to all the fine details: this algorithm’s main purpose was to find the shortest path from a given node TO ALL OTHER NODES (not to just ONE certain destination node). It could be used to just find the shortest path between 2 points (by simply starting it on the start node, and then stopping it when the desired destination node was reached) but that was not its main point. Due to its way of doing things, this algorithm is widely used in computer networks to find all the existing shortest paths between multiple, randomly “placed” network nodes. Basically a lot of network rooters have this algorithm or some variation on it implemented and are using it to find out how to reach this or that router/network segment, etc.

Now, the biggest difference between Dijkstra and A* (what A* brought as a major new optimization) is the heuristics: the “guesstimation” A* uses to only explore the most promising nodes (and not everything in any direction from the start node). Simply put, aside from calculating how much it costs to get from the start node to the currently investigated node (which Dijkstra and A* calculate roughly in the same way, it’s what’s called the g score in A*) A* also calculates the distance between the currently investigated node and the destination if no obstacles were between them (the shortest distance) and picks the next node to investigate based (also) on which node has the shortest such distance.

This is not really a shortcoming of Dijkstra. It’s just a particular optimization for a certain context, namely the one where we only care about the path between 2 certain nodes, and not one node and many other nodes thrown randomly around it. Dijkstra still works great in the 2nd case: there’s no need (sense) for heuristics when you don’t know where your destination is (aka, you’ll know it when you find it). Also Dijkstra calculates ALL paths to ALL possible nodes in ONE go. For situations where the “terrain” doesn’t change a lot, and there’s many destinations, Dijkstra is great. Usually in video games, this is not so much the case, as more often than not obstacles will change all the time (think real time strategy games, where a bridge can be choked and un-choked by moving units all the time) and, for a given cell/node/location we’re only interested in the shortest path to one certain node (not multiple nodes). It’s this context that A* is optimized for.

Basically, if we “turn off” A*’s heuristics, it will behave roughly like Dijkstra. Here’s how the A* Demo App would behave if we’d modify the A* code such that it always assigns 0 for the h score (the heuristics):

As a reminder: the black cells represent non-walkable obstacles. The white cells are walkable. The “S” and the “E” marked cells represent the start and the end nodes, respectively. The red cells represent the path and finally, the cells marked with a dot “.” represent the other cells investigated during the path calculation (but which were found to not make the shortest path). The “radius pattern” is pretty obvious: roughly all nodes in a radius equal to the straight distance between the start and the finish cells have been investigated in order to find the path. You can see in the console that it says that 285 nodes were explored.

Now, when using heuristics, “true” A* will explore a lot less nodes to obtain the same path:

This time around, only 54 nodes were explored in order to get the same path. And you can see that while it may sometimes stray a bit, most of the time A* heuristics guide the nodes exploration in the good direction, towards the destination node.

So here’s how things go bad when you tell A* to find the distance to an unreachable node: what this basically does is turn A* into Dijkstra (the heuristics are useless). Yes, at first its heuristics will guide it towards the destination node, but since it’s unreachable, at some point A* will hit obstacles, and desperately check every other reachable node:

In the above picture we have a “wall” of black cells, completely separating the upper part of our area from the rest. If we place our start node in the lower part of this wall, and our destination (end) node in the part above the wall, we can see that this makes A* check every single reachable cell before declaring the destination unreachable.

There’s no actual way to avoid this with pure A*. We can however, do common sense things to minimize the number of situations where A* will be forced to calculate a path to an unreachable destination, like marking certain isolated areas of our in-game maps as “un-selectable for path destination”, aka, if the player wants to send his units there no A* calculation is triggered, instead, they just won’t move. Of course, there’s also dynamic unreachable areas, created by units chocking a single bridge to a certain area etc, for this situation the previous solution won’t help. In this case, a possible improvement may be to set a maximum number of explored nodes that A* can have before stopping a certain path calculation. However this may lead to having the game unit stop before reaching its destination and/or not taking the shortest path to it. Another thing that may help is to implement A* such that for a given path calculation, it will only do a maximum number of iterations per game state update. This helps, especially in situations where the player selects a large group of units and sends them all to an unreachable destination: if we were to do the whole A* for each of them in one game state update they player would experience a visible “hiccup” in the game play, where nothing gets updated for a long while (few seconds) and then everything catches up all of a sudden. By spreading the A* iterations over multiple game state updates the over all game will be smoother (though, those specific units may seem like they’re not responsive enough to player input, if it takes too many game state updates to finish the A* calculations).

Now, in the previous picture, the one showing the path to the unreachable destination, we can see that in fact the path does end on a reachable node closest to our destination. That’s because the demo app uses an implementation of A* which has already been modified to do this.

On it’s own, pure A* will simply stop on some random (reachable) node when trying to get to an unreachable destination. That node may be the closest to the destination or it may not. The way A*’s heuristics operate, there’s a better chance of it NOT being the closest one.

Fixing this is pretty easy: when we say “closest node to the destination node” we’re actually meaning, “the node which has the shortest distance between it and the destination node”, and this simply means the “h” score. So all we have to do is to keep track, during A*’s path calculating iterations, of the node with the lowest “h” score. And them if the A* loop finishes without finding the destination node (gets to have an empty open list), then we simply get that lowest h node, and construct a path by backtracking from its parent and its parent’s parent and so on up until the start node.

All of the implementations (JavaScript, Java, C) given as example contain code similar to this in their “processSucessors” method/function:

        neighbor.h = this.distanceCalculator.getDistance(neighbor, finish);
        if(this.closestNode == null || this.closestNode.h > neighbor.h) { 
	                 this.closestNode = neighbor;

and again, the main A* loop of each contains something like this:

	if(arrivedAtDestination) {
		endNode = currentNode;
	} else if(this.closestNode != null) {
		endNode = this.closestNode;

Marking terrain as more/less preferable for walking

Lets start with an example. Imagine you have a shallow river that is crossed by a bridge at some point:

For the sake of argument, we’ll assume that the river is walkable, but that walking through the river is a lot slower (less desirable) than walking on the bridge. This means that if we’d like to cross the river, there’s a balance between taking a detour to use the bridge and simply crossing the river:

This can be controlled within A* by assigning different g-score values to different nodes. Namely, if we want a path calculation to lean towards certain nodes and avoid others, after calculating the actual g score for each of those nodes, we’ll artificially increase the g score of those nodes we want the path to avoid, by an amount proportional with how much we want them to be avoided. Note that doing this will never make the node completely “un-walkable”: if a node with a huge g score (but which is not specifically marked as un-walkable) is the only way to get to the destination, a path through it will be returned by A*.

Let’s use the JavaScript Demo App to try to approximate the terrain described in the images above, and see how A* behaves:

The grey cells represent the river and surrounding rocks which are deemed as “walkable, but not as preferable as flat ground and bridge terrain”. The white cells represent the bridge and the flat terrain, which is the most preferable for passing through, and the black cells represent the various un-walkable things like houses and trees and such.

We can see that if we place our start close enough to the bridge, and our destination on the other side of the bridge, A* will calculate a path going over the bridge:

However, if our start cell is sufficiently far from the bridge, A* will consider walking through the river as less costly rather than doing the detour to pass via the bridge:

Now, the A* implementation exemplified here deals with this in the simplest way possible: the API allows for a cell to be marked as “somewhat passable”. There’s also a “multiplicator” value that can be passed on when a path calculation is demanded. This way, after the regular g score for a “somewhat passable” cell is calculated, the fixed value of “one unit of walk multiplied by the multiplicator” will be added to it, making it less desirable for path-finding when the multiplicator is greater than 0:

var somewhatPassableGAddendum = UNIT_SIZE*somewhatPassableCostMultiplicator;


var newG = currentNode.g + this.distanceCalculator.getDistance(currentNode, neighbor);

if(area.getTerrainType(neighbor.x, neighbor.y) == TerrainType.SOMEWHAT_PASSABLE) {
	newG = newG+somewhatPassableGAddendum;


neighbor.g = newG;

Now, what this does is only allow to have one kind of “less preferable terrain”. For example if you’d like to have shallow river and swamp, with shallow river being less preferable than regular terrain but more preferable than swamp, the example A* implementation given here would have to be modified, such that when the terrain is created (the Area object) we can specify individual values to add to the g score for each cell.

As a final note on this: adding different degrees of “walkability” to terrain makes for more nodes to be explored when a path is calculated. For example: if the only way to get to a destination is through some nodes with really low desirability, A* will explore a lot of surrounding nodes with more desirable terrain (but which lead in the wrong direction from the destination node) before choosing to go through the low-desirability ones.

This entry was posted in Examples, Game Dev and tagged , , , , , , . Bookmark the permalink.

One Response to The Basics of A* pathfinding – part 2

  1. Pingback: Using A* with Dynamic Obstacles | Shivan's Blog

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s