Path Finding - System Design and Implementation

Chapter 3 System Design and Implementation

3.3 Path Finding

Route finding is a common function of a GIS system. It is a search problem: either a path exists from the start node to the goal node, or no such path exists. In this thesis, groups are treated as the nodes of a graph, and the relation between two groups is treated as an edge of the graph. Although many algorithms exist for the path finding problem, A-star is an effective one.

A-star is a best-first algorithm for the general search of optimal paths from one node to another. Dijkstra’s algorithm is similar to A-star. The key idea of A-star is that it uses information about both the start node and the goal node. In Figure 3-9, P is a shortest path and N is a node on that path. Dijkstra’s algorithm selects node N because the cost from the start node to N is minimal, whereas A-star selects node N because the sum of the cost from the start node to N and the estimated cost from N to the goal node is minimal.

Figure 3-9 The shortest path in a graph.

A-star is a heuristic algorithm that ranks each node by an estimate of the best path through that node. The typical formula is as follows:

$$f(n) = g(n) + h(n)$$

where g(n) is the total cost from the start node to node n, and h(n) is a heuristic estimate of the cost of the remaining path from node n to the goal node in the best-first search. The cost g(n) is accumulated from g(n-1), where node n-1 is the parent node of node n; it is infinite when node n and node n-1 are not neighbors.

A-star is guaranteed to find the shortest path as long as the heuristic estimate h(n) is admissible, that is, it is never greater than the true remaining distance to the goal node.

The greedy step estimates the f value only for the neighbors of the selected node n, and then selects the neighbor of minimal cost.

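Let $lat_1$ and $lng_1$ be the latitude and longitude of node n, and let $lat_2$ and $lng_2$ be the latitude and longitude of the goal node. With $\mathrm{rad}(d) = d \cdot \pi / 180$ and EARTH_RADIUS = 6378.137 (in kilometers), define

$$radlat_1 = \mathrm{rad}(lat_1),\quad radlat_2 = \mathrm{rad}(lat_2),\quad radlng_1 = \mathrm{rad}(lng_1),\quad radlng_2 = \mathrm{rad}(lng_2)$$

$$a = radlat_1 - radlat_2,\quad b = radlng_1 - radlng_2$$

$$h(n) = 2 \arcsin\left( \sqrt{ \sin^2\left(\frac{a}{2}\right) + \cos(radlat_1)\,\cos(radlat_2)\,\sin^2\left(\frac{b}{2}\right) } \right) \cdot \mathrm{EARTH\_RADIUS}$$

h(n) is a distance function between two points on the Earth.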

This heuristic is admissible. The straight-line distance is the minimal distance between two points, so h(n) is never greater than the true remaining distance to the goal node.
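The distance heuristic described above can be sketched in Python as follows. This is a minimal illustration, not the code of the thesis system: the function names `rad` and `h` follow the symbols in the text, while the four-argument signature is an assumption made here for convenience.

```python
import math

EARTH_RADIUS = 6378.137  # kilometers, the value given in the text


def rad(d: float) -> float:
    """Convert an angle from degrees to radians."""
    return d * math.pi / 180.0


def h(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometers between two points on the Earth.

    (lat1, lng1) is the current node and (lat2, lng2) is the goal node,
    both given in degrees. The argument order is an assumption of this sketch.
    """
    radlat1, radlat2 = rad(lat1), rad(lat2)
    a = radlat1 - radlat2            # difference in latitude
    b = rad(lng1) - rad(lng2)        # difference in longitude
    s = (math.sin(a / 2) ** 2
         + math.cos(radlat1) * math.cos(radlat2) * math.sin(b / 2) ** 2)
    return 2 * math.asin(math.sqrt(s)) * EARTH_RADIUS
```

Because the result is the shortest distance over the sphere, any route along streets between the same two points should be at least as long, which is the property the heuristic needs.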

The A-star algorithm is described as follows:

A-star(start node, goal node)
{
    current = remove lowest rank item from the open list
    for (neighbors of current)
    reconstruct reverse path from goal to start by following parent pointers
}

a point P1 to Pn .
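The search outlined above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the implementation of the thesis system: the graph is assumed to be an adjacency dictionary mapping each node to its neighbors and edge costs, and the names `a_star`, `graph`, and `h` are chosen here for the example.

```python
import heapq
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Graph = Dict[Hashable, Dict[Hashable, float]]


def a_star(graph: Graph, start: Hashable, goal: Hashable,
           h: Callable[[Hashable], float]) -> Optional[Tuple[List[Hashable], float]]:
    """Return (path, cost) from start to goal, or None if no path exists."""
    g = {start: 0.0}                  # best known cost from start to each node
    parent = {start: None}            # parent pointers for path reconstruction
    # Heap entries: (f value, tie-breaker, node, g value when pushed)
    open_heap = [(h(start), 0, start, 0.0)]
    counter = 1
    while open_heap:
        _, _, current, g_pushed = heapq.heappop(open_heap)  # lowest rank item
        if g_pushed > g[current]:
            continue                  # stale entry: a cheaper route was found later
        if current == goal:
            path = []
            while current is not None:        # follow parent pointers back to start
                path.append(current)
                current = parent[current]
            path.reverse()
            return path, g[goal]
        for neighbor, weight in graph.get(current, {}).items():
            cost = g[current] + weight
            if neighbor not in g or cost < g[neighbor]:
                g[neighbor] = cost
                parent[neighbor] = current
                heapq.heappush(open_heap, (cost + h(neighbor), counter, neighbor, cost))
                counter += 1
    return None
```

Instead of removing a node from the open list when a cheaper route is found, this sketch pushes a new entry and skips the stale one when it is popped, which keeps the code short while still allowing a node to be reconsidered.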

3.4 Summary

In this chapter, we presented the point query algorithm and defined the A-star function used in our system. We exploit the features of the data set to solve the point query problem in a different way. The analysis of the point query algorithm is presented in Chapter 4.

Chapter 4 Analysis and Evaluation

In this thesis, we present a closest point finding algorithm for point queries and an A-star based path finding algorithm for path queries. In this chapter, we evaluate the time complexity of the proposed algorithms and compare them with other methods.

4.1 Closest Point Finding Algorithm

In the closest point finding algorithm, we organize the points into groups and represent the groups with a kd-tree. The algorithm is an improved kd-tree algorithm for the virtual tour system. To evaluate its time complexity, we compare it with the traditional kd-tree algorithm.

The closest point finding algorithm exploits the distribution of the data. We use the nearest neighbor query of a kd-tree to speed up the search for the closest groups. Before discussing our algorithm, we review how the nearest neighbor query of a kd-tree is processed.

Query processing in a kd-tree has two parts. The first part is a branch-and-bound search of the hierarchy, a technique shared by various tree-based data structures for metric spaces: it traverses the tree to find the rectangular leaf that contains the query point. The data point in that leaf is not always the nearest neighbor, so the second part checks the parent of the leaf rectangle and its sibling. Table 4-1 shows the time complexity of the nearest neighbor query of a kd-tree.

Table 4-1 The time complexity of the nearest neighbor query of a kd-tree

Operation          Time complexity
Binary partition   log n
Range check        P (P is the number of range checks)

For randomly distributed data, the number of range checks is generally a constant. Hence the time complexity of the nearest neighbor query of a kd-tree is log n.
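The two parts of the nearest neighbor query can be sketched in Python as follows. This is a minimal two-dimensional illustration under the assumption that points are planar (x, y) tuples; the names `Node`, `build`, `sq_dist`, and `nearest` are chosen for this example and are not taken from the thesis system.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


class Node:
    """One kd-tree node: a splitting point, its axis, and two subtrees."""

    def __init__(self, point: Point, axis: int,
                 left: "Optional[Node]", right: "Optional[Node]") -> None:
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a kd-tree by splitting at the median, alternating x and y."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(query, node.point) < sq_dist(query, best):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)        # binary partition: descend toward the query
    if diff * diff < sq_dist(query, best):   # range check: the far side may hold a closer point
        best = nearest(far, query, best)
    return best
```

The descent toward the query corresponds to the log n term, and each time the range check succeeds an additional subtree is visited, which corresponds to the P term.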

For a bad distribution of data, such as points arranged along a circle or a rectangle, almost all nodes must be inspected, as shown in Figure 4-1. Such a distribution causes the worst case of the nearest neighbor query, whose time complexity is n.

Figure 4-1 Two distributions of data in a kd-tree [7]

Although the data in this thesis are not distributed like a circle or a rectangle, the groups divide the space into several polygons. Hence the number of range checks equals the number of members of three or four groups, which is larger than the number of range checks for randomly distributed data. Table 4-2 describes the cost of the operations in the three cases.

Table 4-2 The time complexity of the kd-tree

                 Binary partition   Range check   Total cost
Average case     log n              P             log n + P
Worst case       log n              n             n
In this thesis   log n              P (40~120)    log n + 120

In this thesis, the number of points is n and each group has 10 to 30 members. Hence the number of groups is between n/30 and n/10. Our method has four parts. The first part is group classification, which classifies the point data into line segments.

In the data collection stage, we use a GPS logger to record the points at which we take pictures of the street. The track followed on each traversal of a path, the position of the GPS device, and the weather all influence the result of satellite positioning. When we analyze the GPS logger data, we find some errors produced by satellite positioning.

Figure 4-2 The groups

Figure 4-2 shows the result of the group classification method. Each red line segment is a group, and the midpoint of the segment is the group index. The groups divide the space into several polygons, and the group indexes are evenly distributed in the space.

In the second part, we use the groups produced by the classification as the elements of a kd-tree.

The distribution of the group indexes is similar to a random distribution, so it is suitable for the nearest neighbor query of a kd-tree. In this stage, we need to find the group that contains the closest point. As noted above, a member of the closest group is not always the closest point. Consider a circle centered at the query point whose radius is the distance between the query point and the closest group; the closest point must lie inside this circle. Hence we select a list of candidate groups that may contain the closest point, using the k nearest neighbor query of a kd-tree. The number of candidate groups is three, as shown in Figure 4-3.

Figure 4-3 The typical shapes of urban roads and a query point

Figure 4-3 indicates that the closest point lies on one of the three nearest groups. In this stage, the number of candidate groups is kept small to reduce the cost of the projection point method. Although some polygon shapes may not satisfy this rule for candidate groups, the probability of this situation is small.
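The selection of candidate groups can be sketched in Python as follows. This is an illustration only: it replaces the k nearest neighbor query of the kd-tree with a linear scan to keep the example short, it assumes each group is a line segment given by two planar end points, and the name `candidate_groups` is hypothetical.

```python
import heapq
import math
from typing import List, Tuple

Point = Tuple[float, float]
Group = Tuple[Point, Point]  # a group modeled as a line segment (start, end)


def candidate_groups(query: Point, groups: List[Group], k: int = 3) -> List[Group]:
    """Return the k groups whose index (segment midpoint) is nearest to query.

    A linear scan stands in for the kd-tree k nearest neighbor query.
    """
    def index_distance(group: Group) -> float:
        (ax, ay), (bx, by) = group
        mid_x, mid_y = (ax + bx) / 2, (ay + by) / 2   # the group index
        return math.hypot(mid_x - query[0], mid_y - query[1])

    return heapq.nsmallest(k, groups, key=index_distance)
```

Only the returned groups are passed on to the projection step, so its cost depends on k rather than on the total number of groups.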

In the last step of query processing, the projection point method determines the closest point of a group. The steps of the projection point method are as follows.

The first step computes the projection point. The second step determines the location of the closest point on the group. The final step calculates the minimal distance based on the location of the closest point found in step 2. The cost of each projection is five operations.

The projection method therefore costs 5K operations in total, because the number of candidate groups is K.
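The three steps of the projection point method can be sketched in Python as follows. This is a simplified illustration, assuming that a group is a single line segment in planar coordinates; the thesis system works with latitude and longitude, and the name `closest_on_segment` is hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def closest_on_segment(q: Point, a: Point, b: Point) -> Tuple[Point, float]:
    """Return the point on segment a-b closest to q and its distance to q."""
    ax, ay = a
    bx, by = b
    qx, qy = q
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0                                   # degenerate segment: a single point
    else:
        # Step 1: project q onto the line through a and b.
        t = ((qx - ax) * dx + (qy - ay) * dy) / length_sq
        # Step 2: locate the closest point on the segment by clamping to its ends.
        t = max(0.0, min(1.0, t))
    p = (ax + t * dx, ay + t * dy)
    # Step 3: the minimal distance from q to the group.
    return p, math.hypot(qx - p[0], qy - p[1])
```

Running this once per candidate group and keeping the smallest distance gives the closest point among the K candidates.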

Table 4-3 Comparison of the point query with the kd-tree

              Binary partition   Range check   Projection method   Total cost
Point query   log m              H             5K                  log m + H + 5K
kd-tree       log n              P (40~120)    0                   log n + P

In Table 4-3, H is the number of range checks for the closest group. The distribution of the group indexes makes H similar to the average case of the nearest neighbor query of a kd-tree, so H is a constant. With K = 3 in this thesis, the cost of the projection method of the point query is 5 × 3 = 15, so 5K is a fixed number.

Table 4-4 Comparison with the kd-tree, where m is the number of groups

              Binary partition   Range check   Projection method   Total cost
Point query   log m              H             5K (15)             log m + 5K
kd-tree       log n              n             0                   n

In Table 4-4, the data are distributed like a rectangle, which is a bad distribution for the nearest neighbor query of a kd-tree [7]; this is the worst case of the kd-tree shown in Figure 4-1. The point query first classifies the points into groups, where the number of groups is m. It then performs a K nearest neighbor query on the kd-tree of groups. The final step is the projection method. We find that the cost of the point query is log m + 5K. The result shows that a distribution similar to a set of polygons is suitable for the point query.

4.2 Path Finding Algorithm

The other part of this thesis is path finding. We use the A-star algorithm, a search algorithm from AI that is flexible enough to suit this problem. The definition of the f function is given in Chapter 3. Although Dijkstra’s algorithm is also a shortest path algorithm, its spanning tree is larger than that of A-star on a fixed grid. In this thesis, path finding is a function that connects several points with a path. There is a shortest path between node i and node i+1, so we can find a path that connects node 1 to node n, where {node 1, …, node n} is the order selected by the user. The time for each path finding operation is (U-1)*2*Qp*Pr, where U is the number of user-chosen points, Qp is the time to locate the closest point, and Pr is the time of the A-star algorithm.

The time of A-star is L*B, where L is the level of the goal node in the spanning tree and B is the number of branches. Because the groups are distributed like a net, A-star selects the best path at each step. This is faster than breadth-first search, whose time is B^L.
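The way a route through the user-chosen points is assembled from U-1 legs can be sketched in Python as follows. This is a hedged illustration: `find_path` is a hypothetical callable that stands in for the A-star search between two consecutive points, and it is assumed to return a list that starts at the first point and ends at the second, or None when no path exists.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_points(points: Sequence[T],
                   find_path: Callable[[T, T], Optional[List[T]]]) -> Optional[List[T]]:
    """Join the shortest paths between consecutive user-chosen points.

    For U points, find_path is called U-1 times, once per pair
    (node i, node i+1) in the order selected by the user.
    """
    route = list(points[:1])                  # start at the first chosen point
    for a, b in zip(points, points[1:]):
        leg = find_path(a, b)
        if leg is None:
            return None                       # one pair cannot be connected
        route.extend(leg[1:])                 # skip the point shared with the previous leg
    return route
```

Dropping the first element of each leg avoids listing the shared end point twice where two legs meet.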

A-star works because of its heuristic function. In this thesis, we use the distance between two points on the Earth as the heuristic function of A-star. The heuristic function must have an important property: its estimated distance must not be larger than the real distance between the two points. In Chapter 3, h(n) estimates the distance between two points as the shortest distance between them. If a path exists between the two points, its cost is at least the direct distance between them. If no path exists between the two points, the cost of the path is infinity. Hence the heuristic function satisfies the property.

Table 4-5 Comparison of path finding algorithms

Name                   Information of start node   Information of goal node
DFS                    No                          No
BFS                    No                          No
Greedy BFS             No                          Yes
Dijkstra’s algorithm   Yes                         No
A* algorithm           Yes                         Yes

Table 4-5 shows that A-star is the better algorithm for path finding because it considers information about both the start node and the goal node. Hence A-star can make the best choice each time the path spans out. Although the path finding algorithms were created by different authors, they are related to each other, as shown in Table 4-6.

Table 4-6 The relationship of path finding algorithms

Name                   g(n)      h(n)
BFS                    0         0
Greedy BFS             0         Defined
Dijkstra’s algorithm   Defined   0
A* algorithm           Defined   Defined

Table 4-6 shows that A-star generalizes the other algorithms. When g(n) = 0 and h(n) = 0, A-star is BFS. When g(n) = 0 and h(n) is defined, A-star is greedy BFS. If h(n) = 0, then A-star is Dijkstra’s algorithm. If the weight of every edge is 1, A-star is greedy BFS.

Figure 4-4 An example of greedy BFS

Figure 4-4 shows that the path found by the greedy BFS goes from node A to node B to node G. The cost of this path is 501, whereas the cost of the shortest path is 150. Hence the greedy BFS does not guarantee that the path it finds is the shortest path. The path found by BFS is also not guaranteed to be the shortest path.
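The relationship in Table 4-6 can be illustrated with a single Python routine in which either term of f(n) can be set to 0. This is a sketch under stated assumptions: the graph and heuristic values below are hypothetical, chosen only so that the costs 501 and 150 from the text appear, and they are not the actual graph of Figure 4-4. The name `best_first` and its flags are also illustrative.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical graph: A-B-G costs 1 + 500 = 501, A-C-G costs 100 + 50 = 150.
GRAPH: Dict[str, Dict[str, float]] = {
    "A": {"B": 1, "C": 100},
    "B": {"G": 500},
    "C": {"G": 50},
    "G": {},
}
# Hypothetical heuristic values; B looks closer to the goal than C.
H: Dict[str, float] = {"A": 10, "B": 9, "C": 40, "G": 0}


def best_first(graph: Dict[str, Dict[str, float]], h: Dict[str, float],
               start: str, goal: str,
               use_g: bool = True, use_h: bool = True) -> Optional[Tuple[List[str], float]]:
    """Rank nodes by f(n) = g(n) + h(n), with either term optionally set to 0.

    Returns (path, cost) for the first route that reaches the goal, or None.
    """
    # Heap entries: (rank, tie-breaker, node, path cost so far, path so far)
    heap = [(0.0, 0, start, 0.0, [start])]
    counter = 1
    expanded = set()
    while heap:
        _, _, node, cost, path = heapq.heappop(heap)
        if node == goal:
            return path, cost
        if node in expanded:
            continue
        expanded.add(node)
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            rank = (new_cost if use_g else 0.0) + (h[nxt] if use_h else 0.0)
            heapq.heappush(heap, (rank, counter, nxt, new_cost, path + [nxt]))
            counter += 1
    return None
```

With `use_g=False` the routine behaves as greedy BFS and follows A, B, G at cost 501, while the default setting (A-star) and `use_h=False` (Dijkstra’s algorithm) both return A, C, G at cost 150. With both terms set to 0 the tie-breaker makes the queue first-in first-out, so on this graph it also returns the path through B.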

Figure 4-5 An example of Dijkstra’s algorithm and A-star

Figure 4-5 shows the case in which the cost from the start node to node B equals the cost from the start node to node C. Dijkstra’s algorithm selects either node B or node C to span out.

A-star evaluates the heuristic function of node B and node C and selects the one with the minimal cost. Hence the spanning tree of A-star is smaller than that of Dijkstra’s algorithm. In the fixed grid, the cost of each node is one unit.

The spanning tree of Dijkstra’s algorithm resembles a ripple on water, whereas the spanning tree of A-star resembles a flow from the start node to the goal.

In this thesis, we present a method for point queries and define the A-star function. The features of the data provide another view for designing the query operation. The group indexes, which are a feature of the data, are suitable for the A-star algorithm.

Chapter 5 Conclusions and Future work

In this thesis, we present an algorithm for the closest point query. First, a grouping algorithm groups the points of a street using corner detection. By grouping pictures of a street together, the efficiency of the point query algorithm is enhanced.

Since a street map is a collection of polygons, the closest point must be in the top 3 closest groups. In addition, we use the A-star algorithm to find the shortest path between two user-chosen points. We define the g function, which is the total cost from the start node to the current node, and a heuristic function h, which is a straight-line distance function.

For the closest point query, the time complexity is log m + H + 15. The limitation of the closest point query is that the point data must resemble a collection of line segments. The time to find a path is (U-1)*2*Qp*Pr, where U is the number of user-chosen points, Qp is the time to locate the closest point, and Pr is the time of the A-star algorithm. The time complexity of A-star is the level of the spanning tree times the number of branches.

There are several functions that can enhance the efficiency of this system.

Character recognition can be used to extract the text on the shop signs in a street panorama.

Based on the extracted text, one can add hyperlinks and telephony URIs of the shops on top of the shop signs. Web mining can also be used to gather shop-related information from the Internet and link that information to the shop signs in our system.

We anticipate that more user interaction functions can be provided in this system.

References

[1] Donald E. Knuth. The Art of Computer Programming, Volume 3, Second Edition. Reading, Massachusetts: Addison-Wesley, 1998.

[2] Raphael Finkel and J. L. Bentley. Quad Trees: A Data Structure for Retrieval on Composite Keys. Acta Informatica 4 (1), 1974.

[3] Jon Louis Bentley. Multidimensional Divide and Conquer. Communications of the ACM, 1980.

[4] E. W. Dijkstra. A Note on Two Problems in Connexion with Graphs. Numerische Mathematik 1, 1959.

[5] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. Section 24.3: Dijkstra's algorithm, pp. 595–601.

[6] Amit J. Patel. Amit's Thoughts on Path-Finding and A-Star. URL: http://theory.stanford.edu/~amitp/GameProgramming/, 2005.

[7] Andrew Moore. Efficient Memory-based Learning for Robot Control: An introductory tutorial on kd-trees. Computer Laboratory, University of Cambridge, 1991.

[8] Google map api, URL: http://maps.google.com.tw, 2008.

[9] Google Code, URL: http://codes.google.com.tw, 2008.

[10] Hanan Samet. Foundations of Multidimensional and Metric Data Structures. Addison-Wesley, 1990.
