Chapter 2 System Design and Processes
2.4 System Operation Processes
2.4.2 Navigation process
In the navigation process, the autonomous vehicle analyzes its current location using the information stored during the learning process and navigates to the next node on the learned path. The entire navigation process proposed in this study is shown in Figure 2.15.
In general, the autonomous vehicle analyzes the current environment node by node to navigate to the goal according to the learned information data retrieved from
too light, the vehicle can be “confused” by the image we got from the camera system.
According to the learned environment information, the system is designed to adjust the exposure of the camera dynamically when necessary.
In addition, the autonomous vehicle always checks whether any obstacle exists in front of it. As soon as an obstacle is found and judged to be too close to the vehicle, a collision avoidance procedure is started automatically. Furthermore, if the vehicle reaches a node of “landmark detection,” it adjusts the detection pose and loads the parameters for landmark detection. If a landmark is found successfully, the landmark’s position is used to correct the odometer readings of the vehicle; if not, a strategy for recovering the landmark is started, such as changing the parameters for landmark detection or changing the pose of the vehicle so that the landmark can be detected successfully.
Vehicle navigation loop
Chapter 3
Learning Strategy for Automatic Navigation
3.1 Introduction
The purpose of the learning process for the proposed machine guide dog system is to create a path on a sidewalk to guide a blind person to a selected destination.
Before starting to learn a path, we have to carry out several preparatory tasks. First, we have to choose some landmarks for vehicle localization. Second, we have to calibrate the camera system. Third, we have to infer some guidance parameters. Finally, we have to adopt a learning strategy to learn certain information about each selected landmark.
3.1.1 Selected landmarks in outdoor environments for this study
When the vehicle is in the navigation process, mechanical errors usually accumulate and cause imprecise odometer readings of the vehicle location and orientation. To solve this problem, we adopt in this study the approach of vehicle localization using landmarks. For this purpose, some objects must first be selected as landmarks for the localization work. Chou and Tsai [23] detected light poles and hydrants to localize the vehicle. In this study, we instead select some other natural and artificial objects commonly seen on sidewalks as landmarks. Specifically, we select two types of natural landmarks, tree trunk and lawn
the same purpose are three types of artificial landmarks, namely, signboard, traffic cone, and stop line on roads, as shown in Figure 3.2. With more types of landmarks so selected, we can have more information along the way for localization, and we can then guide the autonomous vehicle to the destination more reliably. The detailed proposed methods for vehicle localization using landmarks will be introduced later in Chapters 5 and 6. In this chapter, we discuss the learning process for these landmarks and other information.
3.1.2 Camera calibration
As mentioned in Chapter 1, it is a complicated task to calibrate a camera’s intrinsic and extrinsic parameters. A space-mapping technique [24], called pano-mapping, is adopted instead in this study to “calibrate” the two-mirror omni-camera system used in this study. We will introduce the adopted technique in Section 3.2.
Figure 3.1 Two types of natural landmarks selected for use in this study. (a) Tree landmark. (b) Lawn corner point.
To navigate in outdoor environments, a trainer of the proposed vehicle system should guide the system to learn and record some parameters of the sidewalk environment for use in the navigation process. The parameters to be learned in this study include image segmentation thresholds and some other environment parameters.
The proposed technique for learning the adopted environment parameters will be described in Section 3.4.1. Also, the proposed method for learning the used image segmentation thresholds will be described in Section 3.4.2. Finally, a process which we propose to create the navigation path is described in Section 3.5.3.
Figure 3.2 Three artificial landmarks used in this study. (a) Traffic cone landmark. (b) Road stop line landmark. (c) Signboard landmark.
3.2 Camera Calibration by Space-mapping Approach
To calibrate the camera system used in this study, we use the pano-mapping technique proposed by Jeng and Tsai [24]. Specifically, it is desired to establish a pano-mapping table to record the relations between the locations of image points and those of the corresponding real-world points. For this, as mentioned in Chapter 2, we assume first that a light ray going through a world-space point P with the elevation angle α and the azimuth angle θ is projected onto a specific point p at coordinates (u, v) in the omni-image. The pano-mapping table specifies the relation between the coordinates (u, v) of the pixel p in the image and the azimuth-elevation angle pair (θ, α) of the corresponding world-space point P. The pano-mapping table is constructed only once; afterwards, it can be looked up whenever 3D information is to be retrieved. Specifically, we establish two pano-mapping tables for Mirrors S and B, respectively, in the camera system used in this study. The details are described in the following algorithm.
Algorithm 3.1 Construction of pano-mapping tables.
Input: two sets of six landmark point pairs (pi, Pi) and (qj, Qj) selected manually in advance, where pi and qj are points in an omni-image I and Pi and Qj are the corresponding points in the world space.
Output: two pano-mapping tables of dimension M × N for Mirrors B and S.
Steps.
Step 1. Let the six known image pixels pi be located at coordinates (ui, vi) in the Mirror B region in omni-image I and the six corresponding known world-space points Pi be at coordinates (Xi, Yi, Zi) in the camera coordinate system, where i = 1, 2, …, 6.
Step 2. Similarly, let the six known image pixels qj be located at coordinates (Uj, Vj) in the Mirror S region in omni-image I and the six corresponding known world-space points Qj be at coordinates (Xj, Yj, Zj) in the camera coordinate system, where j = 1, 2, …, 6.
Step 3. Calculate the radial distances ri and Rj in the image plane from the image pixels pi and qj to the image center, respectively, by the following equations:
$r_i = \sqrt{u_i^2 + v_i^2}; \quad R_j = \sqrt{U_j^2 + V_j^2}.$ (3.1)
resulting in six pairs of elevation angles for Mirrors S and B, respectively.
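Step 5. Under the assumption that the surface geometries of Mirrors S and B are radially symmetric in the range of 360 degrees, use two radial stretching functions, denoted as fB and fS, to describe the relationship between the radial distances ri and the elevation angles αi as well as that between Rj and βj, respectively, by the following equations where i, j = 1, 2, …, 6:
$r_i = f_B(\alpha_i) = b_0 + b_1\alpha_i + b_2\alpha_i^2 + b_3\alpha_i^3 + b_4\alpha_i^4 + b_5\alpha_i^5;$
$R_j = f_S(\beta_j) = a_0 + a_1\beta_j + a_2\beta_j^2 + a_3\beta_j^3 + a_4\beta_j^4 + a_5\beta_j^5.$ (3.3)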
Step 4 as well as a numerical method to obtain the coefficients a0 through a5 and b0 through b5.
Step 7. Using the function fB with the known coefficients b0 through b5, construct a pano-mapping table TB for Mirror B in the form shown in Table 3.1(a) according to the following rule:
for each world-space point Pij with the azimuth-elevation pair (θi, αj), compute the corresponding image coordinates (uij, vij) by the following equations:
$u_{ij} = r_j \cos\theta_i; \quad v_{ij} = r_j \sin\theta_i,$ (3.4)
where $r_j = f_B(\alpha_j)$.
Step 8. In a similar way, construct a pano-mapping table TS for Mirror S by the use of the function fS with the known coefficients a0 through a5 in the form shown in Table 3.1(b).
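The table construction in Algorithm 3.1 can be illustrated by the following minimal Python sketch. It is not the implementation used in this study. It assumes that each radial stretching function is a fifth-degree polynomial mapping an elevation angle (in radians) to a radial distance in pixels, and that the six coefficients are obtained by solving the 6 × 6 linear system formed from the six calibration pairs with Gaussian elimination, which is only one possible choice of numerical method. All function names (radial_distance, fit_radial_stretching, build_pano_table) are hypothetical, and the elevation angles are taken as given inputs.

```python
import math


def radial_distance(u, v):
    """Distance from pixel (u, v) to the image center (the ICS origin)."""
    return math.hypot(u, v)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: elevation angles must be distinct")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_radial_stretching(elevations, radii):
    """Fit r = c0 + c1*a + ... + c5*a^5 through six (elevation, radius) pairs."""
    if len(elevations) != 6 or len(radii) != 6:
        raise ValueError("exactly six calibration pairs are expected")
    vandermonde = [[a ** k for k in range(6)] for a in elevations]
    return solve_linear(vandermonde, radii)


def eval_poly(coeffs, a):
    """Evaluate the polynomial with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * a + c
    return result


def build_pano_table(coeffs, azimuths, elevations):
    """Return table[i][j] = (u_ij, v_ij) for azimuth i and elevation j."""
    table = []
    for theta in azimuths:
        row = []
        for alpha in elevations:
            r = eval_poly(coeffs, alpha)  # r_j = f(alpha_j)
            row.append((r * math.cos(theta), r * math.sin(theta)))
        table.append(row)
    return table
```

In this sketch the same three functions would be called twice, once with the calibration pairs of Mirror B and once with those of Mirror S, to obtain the two tables.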
Figure 3.3 Illustration of constructing pano-mapping tables in this study.
Table 3.1 Two pano-mapping tables used for the two-mirror omni-camera used in this study. (a) Pano-mapping table used for Mirror B. (b) Pano-mapping table used for Mirror S.
vehicle location and the navigation environment. The coordinate systems are illustrated in Figure 3.4 and defined in the following.
(1). Image coordinate system (ICS, denoted as u-v): The origin OI of the image coordinate system is located at the center of the image plane, and the u-v plane coincides with the image plane.
(2). Camera coordinate system (CCS, denoted as X-Y-Z): The origin OC of the CCS is located at the focal point of Mirror B. The X-Z plane is parallel to the ground and the Y-axis is perpendicular to the ground.
(3). Vehicle coordinate system (VCS, denoted as VX-VY): The origin OV of the vehicle coordinate system is located at the center of the autonomous vehicle, and the VX-VY plane coincides with the image plane.
(4). Global coordinate system (GCS, denoted as GX-GY): The origin OG of this system is always set at the start position of the vehicle in the navigation path, and the GX-GY plane coincides with the ground.
When the vehicle is moving in the navigation session, we have to know the relationships among the coordinate systems. It is advantageous to utilize an odometer to localize the vehicle position in the GCS, though the odometer readings are not very accurate all the time. At the beginning of each navigation process, the VCS and CCS follow the vehicle, and the VCS coincides with the GCS. After the vehicle moves for a short distance, as illustrated in Figure 3.5, and stops at a position V at world coordinates (Cx, Cy) with a rotation angle θ, we can derive the coordinate transformation between the coordinates (VX, VY) of the VCS and the coordinates (GX, GY) of the GCS by the following equations:
$G_X = V_X \cos\theta - V_Y \sin\theta + C_x; \quad G_Y = V_X \sin\theta + V_Y \cos\theta + C_y.$ (3.5)
Figure 3.4 Four coordinate systems used in this study. (a) The ICS. (b) The CCS. (c) The VCS. (d) The GCS.
In addition, the relationship between the CCS and the VCS is illustrated in Figure 3.6. Because the projection of the origin of the CCS onto the ground does not coincide with the origin of the VCS, a transformation between the two systems is needed. As illustrated in Figure 3.6, there is a distance between the two origins along the VY-axis, which we denote as Sy. Thus, the transformation between the CCS and the VCS can be described by the following equations:
$V_X = X; \quad V_Y = Z + S_y.$ (3.6)
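The two coordinate transformations described above can be chained as in the following Python sketch. It is only an illustration under stated assumptions: the rotation angle θ is taken to be counterclockwise and in radians, the VCS-to-GCS transformation is taken to be a standard planar rotation followed by a translation by (Cx, Cy), and the offset Sy is assumed to be added to Z. The function names are hypothetical.

```python
import math


def ccs_to_vcs(x, z, s_y):
    """Map the ground-plane camera coordinates (X, Z) to (V_X, V_Y).

    Assumption: the camera origin is offset by s_y along the V_Y axis,
    so V_X = X and V_Y = Z + s_y.
    """
    return x, z + s_y


def vcs_to_gcs(v_x, v_y, c_x, c_y, theta):
    """Map vehicle coordinates to global coordinates.

    Assumption: the vehicle is at (c_x, c_y) in the GCS and rotated
    counterclockwise by theta (radians) with respect to the GCS.
    """
    g_x = v_x * math.cos(theta) - v_y * math.sin(theta) + c_x
    g_y = v_x * math.sin(theta) + v_y * math.cos(theta) + c_y
    return g_x, g_y


def ccs_to_gcs(x, z, s_y, c_x, c_y, theta):
    """Chain the two transformations for a point seen by the camera."""
    v_x, v_y = ccs_to_vcs(x, z, s_y)
    return vcs_to_gcs(v_x, v_y, c_x, c_y, theta)
```

With such a chain, a landmark position measured in the CCS can be expressed in the GCS using the current odometer reading (Cx, Cy, θ).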
Figure 3.6 An illustration of the relation between the CCS and the VCS.
Figure 3.5 A vehicle at coordinates (Cx, Cy) with a rotation angle θ with respect to the GCS.
3.4 Learning of environment and landmark parameters
3.4.1 Learning of environment intensity in windows
In the navigation process, we navigate the vehicle along the learned path. Because the aim is to navigate along the path again and again in this study, each landmark is usually projected onto a fixed region in the image. By this property, we can define regions of interest (ROIs) in the image as shown in Figure 3.7, which are also called environment windows. Some advantages can be obtained from this approach as follows:
1. we can reduce the computation time for detecting the desired landmark;
2. if a feature similar to the detected landmark appears in the environment, it is easy to distinguish the object from the noise.
Figure 3.7 A pair of environment windows for road stop line detection.
However, this property alone does not solve the problem totally in outdoor
influence the results of the environment analysis work as well. For instance, as shown in Figure 3.8(a), because of the overexposure due to the lighting condition, the feature of the curb line is not obvious enough to be recognized. For this reason, we provide a system, based on that of Chou and Tsai [23], with which the trainer can adjust the exposure of the camera so that the landmark can be detected successfully. When the exposure of the camera has been adjusted to a suitable value, the landmark we want to detect can be extracted well under the current condition. Then, we record the image illumination parameter into the path information as part of the learned parameters. To be more specific, we learn a suitable image intensity, called environment intensity hereafter, from the image content in the environment windows during the learning process. A detailed algorithm for the above process is described in the following.
Algorithm 3.2 Learning of the environment intensity parameter at a path node.
Input: a relevant set of environment windows Winset for a certain path node with a pre-selected landmark under the assumption that the vehicle arrives at the node currently.
Output: an environment intensity parameter Ien.
Steps.
Step 1. Adjust the camera exposure and acquire a suitable image Iall.
Step 2. Check if the desired landmark feature is well imaged in the current illumination: if not, go to Step 1; otherwise, continue.
Step 3. For each pixel in image Iall with color (R, G, B) in winB of Winset, calculate its intensity value Yi by the following equation and record Yi into a set SY:
$Y_i = 0.299R + 0.587G + 0.114B.$ (3.7)
Step 4. Calculate the average value Ien of all Yi in SY as the output in the following way, where N is the number of pixels in winB of Winset:
$I_{en} = \frac{1}{N}\sum_{i=1}^{N} Y_i.$ (3.8)
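Steps 3 and 4 of the above algorithm can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in this study. It assumes that the image is given as rows of (R, G, B) tuples and that an environment window is a rectangle described by the hypothetical tuple (top, left, height, width).

```python
def luminance(r, g, b):
    """Intensity value Y of one pixel computed from its (R, G, B) color."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def environment_intensity(image, window):
    """Average intensity of the pixels inside one environment window.

    image  : list of rows, each row a list of (R, G, B) tuples
    window : (top, left, height, width), a hypothetical window format
    """
    top, left, height, width = window
    values = [
        luminance(*image[y][x])
        for y in range(top, top + height)
        for x in range(left, left + width)
    ]
    if not values:
        raise ValueError("the environment window contains no pixels")
    return sum(values) / len(values)
```

The returned value would then be stored with the path node and compared with the intensity measured in the same window during navigation when deciding whether the exposure needs to be adjusted.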
Some examples of suitable illuminations for navigation tasks are shown in Figures 3.8(b) and 3.9(b). The environment intensity parameters learned for them in the above way are recorded as part of the learning result of landmark detection described later.
Figure 3.8 Two different illuminations for curb line detection. (a) An instance of overexposure. (b) A suitable case.
3.4.2 Learning of artificial landmark segmentation parameters
It is very important for us to localize landmarks in this study. Before landmark localization, we have to utilize some segmentation methods for image analysis. In this section, we introduce the segmentation parameters proposed for use in this study for artificial landmark segmentation. Regarding natural landmark detection, we utilize the moment-preserving thresholding proposed by Tsai [26] to conduct landmark
The landmark detection methods used in this study will be described in Chapters 5 and 6.
(1) For sidewalk curb segmentation, we first use the color information (hue and saturation) in the HSI color model together with image thresholding to find the curb feature in the image. Then we adopt the Canny edge detection technique to extract the desired landmark shape. The thresholds for the hue and saturation values are collected as a set of curb segmentation parameters. The road stop line and the traffic cone are processed in a similar way.
(2) For signboard segmentation, we use the HSI color model to extract the signboard shape. The threshold values for hue and saturation, as well as the contour of the signboard described by the principal components obtained from principal component analysis, are collected as a set of signboard segmentation parameters.
(3) For tree segmentation, we use the moment-preserving thresholding mentioned previously to extract the tree shape. The contour of the tree, also described by the principal components obtained from principal component analysis, is collected as a set of tree segmentation parameters.
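The hue and saturation thresholding used for the artificial landmarks can be illustrated by the following Python sketch. It is not the implementation used in this study: it uses the common textbook conversion from RGB to HSI, the function names are hypothetical, and the threshold ranges passed by the caller are placeholders for the values that the trainer would learn. A hue range whose lower bound exceeds its upper bound is interpreted as wrapping around 0 degrees, which is convenient for reddish objects. The subsequent Canny edge detection is not shown.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, intensity)."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    intensity = (r + g + b) / 3.0
    saturation = 0.0 if intensity == 0 else 1.0 - min(r, g, b) / intensity
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0:
        hue = 0.0  # gray pixel: hue is undefined, use 0 by convention
    else:
        ratio = max(-1.0, min(1.0, 0.5 * ((r - g) + (r - b)) / den))
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity


def in_hue_range(hue, low, high):
    """Range test that supports wrap-around when low > high."""
    if low <= high:
        return low <= hue <= high
    return hue >= low or hue <= high


def segment_by_hue_saturation(image, hue_range, sat_range):
    """Return a binary mask of pixels whose hue and saturation pass the thresholds."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            hue, sat, _ = rgb_to_hsi(r, g, b)
            mask_row.append(
                in_hue_range(hue, *hue_range) and sat_range[0] <= sat <= sat_range[1]
            )
        mask.append(mask_row)
    return mask
```

In such a scheme, the pair of ranges (hue_range, sat_range) plays the role of one set of segmentation parameters recorded for a landmark.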
When conducting landmark learning, the trainer can detect a desired landmark through a user interface of the system and adjust the values of the related set of segmentation parameters. After an appropriate result is obtained from the landmark detection process, the segmentation parameters and the learned landmark information which the trainer will use are recorded together as part of the learned path. The process for learning landmark segmentation parameters is shown in Figure 3.10.
Figure 3.9 Two different illuminations for signboard detection. (a) An instance of underexposure. (b) A suitable case.
3.5 Learning Processes for Creating a Navigation Path
In this section, we introduce the proposed method for learning a navigation path in the learning process, which is based on the method of Chou and Tsai [23]. In the learning process, we generally use the odometer to localize the vehicle and to approximate the positions of the detected landmarks. The proposed strategy for learning landmarks for vehicle localization is described in Section 3.5.1. In addition, obstacles on the sidewalk may block the vehicle along the way. For example, the hole on the sidewalk shown in Figure 3.11 may cause the vehicle to fall outside the sidewalk. Thus, we propose a method to learn the positions of such obstacles, called fixed obstacles hereafter, which is described in Section 3.5.2. Finally, we introduce the entire proposed procedure to learn a navigation path in Section 3.5.3.
Figure 3.10 The process for learning landmark segmentation parameters.
3.5.1 Strategy for learning landmark positions and related vehicle poses
We introduce the proposed strategy for learning a landmark and its position in this section. Simply speaking, for a landmark to be learned well, we have to guide the vehicle to appropriate positions to detect it. To increase the accuracy of the position of the learned landmark, we take images of the landmark a number of times from a number of different positions or different directions. The reason why we take multiple images is that the outdoor condition might cause the taken images to be all different, especially when there are clouds floating across the sun in the sky during the noon time. After we collect multiple images and analyze the feature data, a more precise landmark position with the corresponding vehicle pose can be obtained. Then, it is recorded as part of the learned navigation path.
To be more specific, after we detect the landmark in omni-images multiple times with the vehicle in different poses, we calculate the mean of all the detected landmark positions as the estimated landmark position, denoted as Plandmark. Furthermore, we choose, among the multiple vehicle poses, the one closest to that yielding the estimated Plandmark for use as the learned pose, denoted as Pvehicle, corresponding to the estimated Plandmark. The detailed algorithm for the above process is described in the following.
Figure 3.11 A fixed obstacle in a navigation path which may cause the vehicle to fall outside the sidewalk.
Algorithm 3.3 Learning of the landmark position and related vehicle pose.
Input: a landmark type of the appointed landmark to be learned.
Output: an estimated landmark position Plandmark and a corresponding vehicle pose Pvehicle.
Steps.
Step 1. Initialize three parameters i, j, and k to zero, where i, j, and k index the i-th vehicle position, the j-th vehicle orientation, and the k-th landmark detection, respectively.
Step 2. Guide the vehicle to a position Vi = (Pxi, Pyi) and record this vehicle position Vi into a set SV.
Step 3. Turn the vehicle into an orientation Thij and record this orientation into a set STh.
Step 4. According to the type of landmark, localize the landmark by the use of the corresponding localization technique (described in Chapter 5) to obtain the landmark position pijk = (xijk, yijk), and record this landmark position pijk into a set SL.
Step 5. Go back to Step 4 K times as needed, and record the number of recorded landmark positions in the j-th vehicle orientation and the i-th vehicle position, denoted as Nij = K.
Step 6. Go back to Step 3 J times as needed, and record the number of different vehicle orientations in the i-th vehicle position, denoted as Ni = J.
Step 7. Go to Step 2 for I times as needed, and record the number of the different
Step 8. Compute the desired landmark position Plandmark using the set SL by the distance to Plandmark computed in terms of the Euclidean distance.
Step 10. Choose a median orientation Thc from all Thca in STh, where a = 1, 2, …, Nc, and set the desired vehicle pose Pvehicle as Pvehicle = (Pxc, Pyc, Thc).
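The final computation of this algorithm can be sketched in Python as follows. The sketch is one possible reading of the steps rather than the implementation used in this study. It assumes that the estimated landmark position is the mean of all recorded detections, that the chosen vehicle position is the one whose own detections have a mean closest to that estimate in terms of the Euclidean distance, and that the median orientation is taken as a value actually recorded at that position. The nested record format and the function names are hypothetical.

```python
import math
from statistics import median_low


def mean_point(points):
    """Mean of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def learn_landmark(records):
    """Estimate the landmark position and the related vehicle pose.

    records: list of (vehicle_position, views), where vehicle_position is
    (Px, Py) and views is a list of (orientation, detections); detections
    is the list of landmark positions (x, y) obtained in that orientation.
    """
    all_points = [p for _, views in records for _, dets in views for p in dets]
    if not all_points:
        raise ValueError("no landmark detection was recorded")
    p_landmark = mean_point(all_points)

    def error(record):
        # Distance between the local mean detection and the global estimate.
        _, views = record
        local = [p for _, dets in views for p in dets]
        return math.dist(mean_point(local), p_landmark)

    # Only vehicle positions with at least one detection are candidates.
    candidates = [r for r in records if any(dets for _, dets in r[1])]
    (px, py), views = min(candidates, key=error)

    # median_low returns an orientation that was actually recorded.
    th = median_low([t for t, _ in views])
    return p_landmark, (px, py, th)
```

The first returned value corresponds to Plandmark and the second to Pvehicle in the notation of the algorithm.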
3.5.2 Learning of fixed obstacles in a navigation path
In this section, we propose a function for use on the learning interface which can be used to learn fixed obstacles. It is based on Chou and Tsai [23]. When we guide the vehicle to a location where a fixed obstacle is projected onto the image region of both Mirrors S and B, we utilize this function to learn the fixed obstacle. We know that the fixed obstacle is located on the sidewalk, so we can use this property to learn the fixed obstacles more easily. As shown in Figure 3.12, first we use the mouse to click the position of the fixed obstacle on the region of Mirror B in the image. Then, the system will record the pixel in the image for use later to calculate the learned position of the fixed obstacle. After selecting sufficient obstacle points in the omni image, the position of the fixed obstacles Wobs and some parameters for avoiding the obstacles are recorded together as part of the learned path information. Finally, the trainer may