Jul
25
2009
6

TomTom Mobile GPS Nav App on the Nokia N97

Demo of the TomTom Mobile GPS Navigation Application on the Nokia N97 Symbian S60v5 phone. There is no current version of TomTom Mobile that is fully compatible with the S60v5 OS. The touchscreen does not work with this app and this version of TomTom cannot use the internal GPS receiver so you have to use an external Bluetooth GPS receiver with this.

Written by admin in: News | Tags: ,
Jul
22
2009
0

A star routing algorithm implementation

The A* Algorithm

The A* algorithm, stripped of all the code, is fairly simple. There are two sets, OPEN and CLOSED. The OPEN set contains those nodes that are candidates for examining. Initially, the OPEN set contains just one element: the starting position. The CLOSED set contains those nodes that have already been examined. Initially, the CLOSED set is empty. Graphically, the OPEN set is the “frontier” and the CLOSED set is the “interior” of the visited areas. Each node also keeps a pointer to its parent node so that we can determine how it was found.

There is a main loop that repeatedly pulls out the best node n in OPEN (the node with the lowest f value) and examines it. If n is the goal, then we’re done. Otherwise, node n is removed from OPEN and added to CLOSED. Then, its neighbors n' are examined. A neighbor that is in CLOSED has already been seen, so we don’t need to look at it. A neighbor that is in OPEN is scheduled to be looked at, so we don’t need to look at it now. Otherwise, we add it to OPEN, with its parent set to n. The path cost to n', g(n'), will be set to g(n) + movementcost(n, n').

OPEN = priority queue containing START
CLOSED = empty set
while lowest rank in OPEN is not the GOAL:
  current = remove lowest rank item from OPEN
  add current to CLOSED
  for neighbors of current:
    cost = g(current) + movementcost(current, neighbor)
    if neighbor in OPEN and cost less than g(neighbor):
      remove neighbor from OPEN, because new path is better
    if neighbor in CLOSED and cost less than g(neighbor): **
      remove neighbor from CLOSED
    if neighbor not in OPEN and neighbor not in CLOSED:
      set g(neighbor) to cost
      add neighbor to OPEN
      set priority queue rank to g(neighbor) + h(neighbor)
      set neighbor's parent to current

reconstruct reverse path from goal to start by following parent pointers

Data Structures

What’s the first thing you’ll think of using for the OPEN and CLOSED sets? You probably thought “array”. You may have thought “linked list”, too. There are many different data structures we can use, but to pick one we should look at what operations are needed.

There are three main operations we perform on the OPEN set: the main loop repeatedly finds the best node and removes it; the neighbor visiting will check whether a node is in the set; and the neighbor visiting will insert new nodes. Insertion and remove-best are operations typical of a priority queue.

The choice of data structure depends not only on the operations but on the number of times each operations runs. The membership test runs once for each neighbor for each node visited. Insertion runs once for each node being considered. Remove-best runs once for each node visited. Most nodes that are considered will be visited; the ones that are not are the fringe of the search space. When evaluating the cost of operations on these data structures, we need to consider the maximum size of the fringe (F).

In addition, there’s a fourth operation, which is relatively rare but still needs to be implemented. If the node being examined is already in the OPEN set (which does happen frequently), and if its f value is better than the one already in the OPEN set (which is rare), then the value in the OPEN set must be adjusted. The adjustment operation involves removing the node (which is not the best f) and re-inserting it. These two steps may be optimized into one that moves the node.

Unsorted arrays or linked lists

The simplest data structure is an unsorted array or list. Membership test is slow, O(F) to scan the entire structure. Insertion is fast, O(1) to append to the end. Finding the best element is slow, O(F) to scan the entire structure. Removing the best element is O(F) for arrays and O(1) for linked lists. The adjust operation is O(F) to find the node and O(1) to change its value.

Sorted arrays

To make remove-best fast, we can keep the array sorted. Membership is then O(log F), since we can use binary search. Insertion is slow, O(F) to move all the elements to make space for the new one. Finding the best is fast, O(1) since it’s already at the end. And removing the best is O(1) if we make sure the best sorts to the end of the array. The adjust operation is O(log F) to find the node and O(F) to change its value/position.

Sorted linked lists

With sorted arrays, insertion was slow. If we use a linked list, we can make that fast. Membership in the linked list is slow, O(F) to scan the list. Insertion is fast, O(1) to insert a new link, but it was O(F) to find the right position for it. Finding the best remains fast, O(1) because the best is at the end. Removing the best also is O(1). The adjust operation is O(F) to find the node and O(1) to change its value/position.

Sorted skip lists

Searching an unsorted linked list is slow. We can make that faster if we use a skip list instead of a linked list. With a skip list, membership is fast if you have the sort key: O(log F). Insertion is O(1) like a linked list if you know where to insert. Finding the best node is fast if the sort key is f, O(1), and removing a node is O(1). The adjust operation involves finding a node, removing it, and reinserting it.

If we use skip lists with the map location as the key, membership is O(log F), insertion is O(1) after we’ve performed the membership test, finding the best node is O(F), and removing a node is O(1). This is better than unsorted linked lists in that membership is faster.

If we use skip lists with the f value as the key, membership is O(F), insertion is O(1), finding the best node is O(1), and removing a node is O(1). This is no better than sorted linked lists.

Indexed arrays

If the set of nodes is finite and reasonably sized, we can use a direct indexing structure, where an index function i(n) maps each node n to an index into an array. Unlike the unsorted and sorted arrays, which have a size corresponding to the largest size of OPEN, with an indexed array the array size is always max(i(n)) over all n. If your function is dense (i.e., there are no indices unused), then max(i(n)) will be the number of nodes in your graph. Whenever your map is a grid, it’s easy to make the function dense.

Assuming i(n) is O(1), membership test is O(1), as we merely have to check whether Array[i(n)] contains any data. Insertion is O(1), as we just ste Array[i(n)]. Find and remove best is O(numnodes), since we have to search the entire structure. The adjust operation is O(1).

Hash tables

Indexed arrays take up a lot of memory to store all the nodes that are not in the OPEN set. An alternative is to use a hash table, with a hash function h(n) that maps each node n into a hash code. Keep the hash table twice as big as N to keep the chance of collisions low. Assuming h(n) is O(1), membership test is expected O(1), insertion is expected O(1), and remove best is O(numnodes), since we have to search the entire structure. The adjust operation is O(1).

Binary heaps

A binary heap (not to be confused with a memory heap) is a tree structure that is stored in an array. Unlike most trees, which use pointers to refer to children, the binary heap uses indexing to find children.

In a binary heap, membership is O(F), as you have to scan the entire structure. Insertion is O(log F) and remove-best is O(log F). The adjust operation is tricky, with O(F) to find the node and suprisingly, only O(log F) to adjust it.

Splay trees

Heaps are a tree-based structure with expected O(log F) time operations. However, the problem is that with A*, the common behavior is that you have a low cost node that is removed (causing O(log F) behavior, since values have to move up from the very bottom of the tree) followed by low cost nodes that are added (causing O(log F) behavior, since these values are added at the bottom and bubble up to the very top). The expected case behavior of heaps here is equivalent to the worst case behavior. We may be able to do better if we find a data structure where expected case is better, even if the worst case is no better.

Splay trees are a self adjusting tree structure. Any access to a node in the tree tends to bring that node up to the top. The result is a “caching” effect, where rarely used nodes go to the bottom and don’t slow down operations. It doesn’t matter how big your splay tree is, because your operations are only as slow as your “cache size”. In A*, the low cost nodes are used a lot, and the high cost nodes aren’t used for a long time, so those high cost nodes can move to the bottom of the tree.

With splay trees, membership, insertion, remove-best, and adjustment are all expected O(log F), worst case O(F). Typically however, the caching keeps the worst case from occurring. Dijkstra’s algorithm and A* with an underestimating heuristic however have some peculiar characteristics that may keep splay trees from being the best. In particular, f(n') >= f(n) for nodes n and neighboring node n'. When this happens, it may be that the insertions all occur on one side of the tree and end up putting it out of balance.

HOT queues

There’s another good data structure that may be better than heaps. Often, you can restrict the range of values that would be in the priority queue. Given a restricted range, there are often better algorithms. For example, sorting can be done on arbitrary values in O(N log N) time, but when there is a fixed range the bucket or radix sorts can perform sorting in O(N) time.

We can use HOT Queues (Heap On Top queues) to take advantage of f(n') >= f(n) where n' is a neighbor of n. We are removing the node n with minimal f(n), and inserting neighbors n' with f(n) <= f(n') <= f(n) + delta where delta <= C. The constant C is the maximum change in cost from one point to an adjacent point. Since f(n) was the minimal f value in the OPEN set, and everything being inserted is <= f(n) + delta, we know that all f values in the OPEN set are within a range of 0 .. delta. As in bucket/radix sort, we can keep “buckets” to sort the nodes in the OPEN set.

With K buckets, we reduce any O(N) cost to an average of O(N/K). With HOT queues, the topmost bucket uses a binary heap and all other buckets are unsorted arrays. Thus, its membership test for the top bucket is expected O(F/K), and insertion and remove-best are O(log (F/K)). Membership for the other buckets is O(F/K), insertion is O(1), and remove-best never occurs!. If the top bucket empties, then we need to convert the next bucket, an unsorted array, into a binary heap. It turns out this operation (“heapify”) can be run in O(F/K) time. The adjust operation is best treated as a O(F/K) removal followed by an O(log (F/K)) or O(1) insertion.

In A*, many of the nodes we put into OPEN we never actually need. HOT Queues are a big win because the elements that are not needed are inserted in O(1) time. Only elements that are needed get heapified (which is not too expensive). The only operation that is more than O(1) is node deletion from the heap, which is only O(log (F/K)).

In addition, if C is small, we can just set K = C, and then we do not even need a heap for the smallest bucket, since all the nodes in a bucket have the same f value. Insertion and remove-best are both O(1) time! One person reported that HOT queues are as fast as heaps for at most 800 nodes in the OPEN set, and are 20% faster when there are at most 1500 nodes. We would expect that HOT queues get faster as the number of nodes increases.

A cheap variant of a HOT queue is a two-level queue: put good nodes into one data structure (a heap or an array) and put bad nodes into another data structure (an array or a linked list). Since most nodes put into OPEN are “bad”, they are never examined, and there’s no harm putting them into the big array.

Comparison

It is important to keep in mind that we are not merely looking for asymptotic (“big O”) behavior. We also want to look for a low constant. To see why, consider an algorithm that is O(log F) and another that is O(F), where F is the number of elements in the heap. It may be that on your machine, an implementation of the first algorithm takes 10,000 * log(F) seconds, while an implementation of the second one takes 2 * F seconds. For F = 256, the first would take 80,000 seconds and the second would take 512 seconds. The “faster” algorithm takes more time in this case, and would only start to be faster when F > 200,000.

You cannot merely compare two algorithms. You should also compare the implementations of those algorithms. You also have to know what size your data might be. In the above example, the first implementation is faster for F > 200,000, but if in your map, F stays under 30,000, then the second implementation would have been better.

None of the basic data structures is entirely satisfactory. Unsorted arrays or lists make insertion very cheap and membership and removal very expensive. Sorted arrays or lists make membership somewhat cheap, removal very cheap and insertion very expensive. Binary heaps make insertion and removal somewhat cheap, but membership is very expensive. Splay trees make everything somewhat cheap. HOT queues make insertions cheap, removals fairly cheap, and membership tests somewhat cheap. Indexed arrays make membership and insertion very cheap, but removals are incredibly expensive, and they can also take up a lot of memory. Hash tables perform similarly to indexed arrays, but they can take up a lot less memory in the common case, and removals are merely expensive instead of extremely expensive.

For a good list of pointers to more advanced priority queue papers and implementations, see Lee Killough’s Priority Queues page.

Written by admin in: Routing | Tags: , ,
Jul
22
2009
0

Use of the Heuristic in A* routing algorithm

Heuristic

The heuristic function h(n) tells A* an estimate of the minimum cost from any vertex n to the goal. It’s important to choose a good heuristic function.

A*’s Use of the Heuristic

The heuristic can be used to control A*’s behavior.

  • At one extreme, if h(n) is 0, then only g(n) plays a role, and A* turns into Dijkstra’s algorithm, which is guaranteed to find a shortest path.
  • If h(n) is always lower than (or equal to) the cost of moving from n to the goal, then A* is guaranteed to find a shortest path. The lower h(n) is, the more node A* expands, making it slower.
  • If h(n) is exactly equal to the cost of moving from n to the goal, then A* will only follow the best path and never expand anything else, making it very fast. Although you can’t make this happen in all cases, you can make it exact in some special cases. It’s nice to know that given perfect information, A* will behave perfectly.
  • If h(n) is sometimes greater than the cost of moving from n to the goal, then A* is not guaranteed to find a shortest path, but it can run faster.
  • At the other extreme, if h(n) is very high relative to g(n), then only h(n) plays a role, and A* turns into BFS.

So we have an interesting situation in that we can decide what we want to get out of A*. At exactly the right point, we’ll get shortest paths really quickly. If we’re too low, then we’ll continue to get shortest paths, but it’ll slow down. If we’re too high, then we give up shortest paths, but A* will run faster.

In a mobile map application, this property of A* can be very useful. For example, you may find that in some situations, you would rather have a “good” path than a “perfect” path. To shift the balance between g(n) and h(n), you can modify either one.

Note:
Technically, the A* algorithm should be called simply A if the heuristic is an underestimate of the actual cost. However, we will continue to call it A* because the implementation is the same and the programming community does not distinguish A from A*.

Speed or accuracy?

A*’s ability to vary its behavior based on the heuristic and cost functions can be very useful in a mobile map application. The tradeoff between speed and accuracy can be exploited to make your application faster. For most mobile map applications, you don’t really need the best path between two points. You just need something that’s close. What you need may depend on how density of the map and how fast the mobile phone is.

The choice between speed and accuracy does not have to be global. You can choose some things dynamically based on the importance of having accuracy in some region of the map. For example, it may be more important to choose a good path near the current location, on the assumption that we might end up recalculating the path or changing direction at some point, so why bother being accurate about the faraway part of the path?

Scale

A* computes f(n) = g(n) + h(n). To add two values, those two values need to be at the same scale. If g(n) is measured in hours and h(n) is measured in meters, then A* is going to consider g or h too much or too little, and you either won’t get as good paths or you A* will run slower than it could.

Exact heuristics

If your heuristic is exactly equal to the distance along the optimal path, A* will expand very few nodes. What’s happening inside A* is that it is computing f(n) = g(n) + h(n) at every node. When h(n) exactly matches g(n), the value of f(n) doesn’t change along the path. All nodes not on the right path will have a higher value of f than nodes that are on the right path. Since A* doesn’t consider higher-valued f nodes until it has considered lower-valued f nodes, it never strays off the shortest path.

Manhattan distance

If your map is small (the number of routing nodes are not so huge), you can precompute the length of the shortest path between every pair of routing nodes to have exact heuristic. However, this is not feasible for most of the cases. Therefore, we suggest you use Manhattan distance as heuristic in your mobile map application. Manhattan distance is always >=  exact heuristic, we can find path that close to best path but the application runs fast. If you use Euclidean distance as heuristic, the application runs slower because Euclidean distance is always < exact heuristic.

Written by admin in: Routing | Tags: , , ,
OpenMobileMap would not be possible without the generous support from our sponsors:

J2MEGPS


OpenMobileMap | The Free Mobile Map Resources