Chapter 2 System Configuration and Processes
2.4 System Processes
2.4.2 Navigation Process
Before the vehicle starts to navigate, the system reads the path and environment information created in the learning process described previously. To guide the vehicle along the learned path, the vehicle is instructed to move from each node to the next in sequence. A flowchart of the proposed vehicle navigation process is shown in Fig. 2.6.
In more detail, when the vehicle is navigating to the next node, it first checks the navigation mode to determine whether it has to detect the curb line and follow it.
If the curb-line detection-and-following process fails, the system will enter the blind navigation mode and reconfirm the navigation mode in the next loop. Also, the navigation process detects the target landmark continually until the correct landmark appears in the omni-image. When the vehicle navigates to the desired node successfully, it can obtain the navigation information of the next node from the learned path kept in the system.
In addition, when the navigation process finds the target landmark successfully, the vehicle will adjust its position and load the relevant parameters for navigation to the next node. However, some nodes provide the navigation information only, which we call “tuning nodes.” This kind of node can help the vehicle to navigate to the terminal node successfully.
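The node-by-node loop described above can be sketched as follows. This is only a minimal illustration under simplifying assumptions, not the implementation used in this study: the names PathNode, navigate, detect_curb, and detect_landmark are hypothetical, the mode is checked once per node rather than in every loop iteration, and the two detectors are passed in as plain functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PathNode:
    index: int          # serial number assigned in the learning stage
    follow_curb: bool   # learned navigation mode for the leg toward this node
    has_landmark: bool  # False for a "tuning node" (navigation information only)


def navigate(
    nodes: List[PathNode],
    detect_curb: Callable[[PathNode], bool],
    detect_landmark: Callable[[PathNode], bool],
) -> List[Tuple[int, str, bool]]:
    """Visit the nodes in order and log (node index, mode used, localized?)."""
    log = []
    for node in nodes[1:]:  # nodes[0] is the start position
        # Fall back to blind navigation when curb detection fails.
        if node.follow_curb and detect_curb(node):
            mode = "curb"
        else:
            mode = "blind"
        # Tuning nodes carry no landmark, so no localization is attempted.
        localized = node.has_landmark and detect_landmark(node)
        log.append((node.index, mode, localized))
    return log
```

In this sketch a node whose curb detection fails is simply logged as "blind"; the actual process would re-check the navigation mode in the next loop, as described above.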
Figure 2.5 Flowchart of proposed learning process.
Figure 2.6 Flowchart of proposed navigation process.
Chapter 3
Learning of Outdoor Environment Features
3.1 Introduction
In order to use an autonomous vehicle to navigate in an outdoor environment, it is necessary to build complete path information to guide the vehicle. Therefore, creating a path map and selecting appropriate landmarks are primary tasks for successful security patrolling by vehicle navigation. In this chapter, we introduce our ideas for the selection of landmarks and the learning of guidance parameters in outdoor environments. Several coordinate systems, including the image coordinate system, the camera coordinate system, the vehicle coordinate system, and the global coordinate system, are defined in Section 3.2. The learning techniques and strategies are described in Section 3.3. Finally, a detailed algorithm for the learning process is given in Section 3.4.
3.1.1 Selection of Sequential Landmarks for Learning
During vehicle navigation, mechanical errors accumulate and degrade the odometer readings of the vehicle location and orientation. To solve this problem, we adopt an approach of “vehicle localization using landmarks.” For this purpose, some objects must first be selected as landmarks for the vehicle localization task. In this study, we select objects sequentially along the pre-selected path as landmarks. Because of this sequential selection, we can estimate the approximate position of the vehicle on the sidewalk without depending excessively on the odometer readings.
The main types of landmarks selected for localization in this study are light poles, hydrants, and tree trunks. Two other types of landmarks, namely ramps and curbs, which provide environment parameters for vehicle guidance, are also selected.
With more categories of landmarks selected, we can utilize more information along the path for vehicle localization, which reduces the chance of going astray or falling off the sidewalk and guides the autonomous vehicle to the terminal point more reliably. The proposed methods of vehicle localization using landmarks will be described in detail later in Chapters 5 and 6.
3.1.2 Idea of Learning Guidance Parameters and Landmark Features in Outdoor Environments
To navigate in an unknown outdoor environment, some kinds of environment parameters or features should be learned for use in the navigation stage. The first feature learned in the proposed system is the navigation path data. We can obtain the position of the vehicle from the odometer reading, but mechanical errors usually make the readings of the vehicle location imprecise. Therefore, correcting the position of the vehicle and the odometer reading becomes an important task. In Section 3.3.2, we will describe how to collect path data for vehicle localization by controlling the vehicle to navigate along a pre-selected path in an outdoor environment.
The features to be learned next are some camera and vehicle guidance parameters. Some of these parameters require manual measurement and are taken as inputs to the process of learning other features; we refer to such feature data as “prior knowledge.” More details of such parameters will be introduced in Sections 3.3.2 and 3.3.3.
The last feature to be learned is the landmark. In order to use the landmarks to conduct vehicle localization, it is necessary to “train” the vehicle to “know” what to detect and how to recognize landmarks. That is, the vehicle must learn which features of each selected landmark should be detected, and it should then be able to recognize each landmark by matching the learned features against those computed in the navigation phase. For this purpose, we adopt in this study a powerful approach using SURF [20] to extract such features from selected-landmark images. Meanwhile, we also record the vehicle location with respect to each selected landmark in terms of depth data. The detailed learning process is described in Section 3.3.4.
3.2 Coordinate Systems
In this section, we introduce the coordinate systems used in this study, which describe the relations between the devices used and the selected landmarks in the navigation environment. The coordinate systems are illustrated in Fig. 3.1 and defined as follows.
1. Image coordinate system (ICS), denoted as (u, v): the u-v plane coincides with the image plane, and the origin OI of the ICS is located at the center of the image plane.
2. Vehicle coordinate system (VCS), denoted as (VX, VY): the origin OV of the VCS is located at the center of the vehicle, and the VX-VY plane coincides with the ground.
3. Camera coordinate system (CCS), denoted as (X, Y, Z): the origin OC of the CCS is located at the lens center of the KINECT device, the X-Z plane is parallel to the ground, and the Y-axis is perpendicular to the ground.
4. Global coordinate system (GCS), denoted as (GX, GY): the origin OG of this system is always placed at the start position of the vehicle in the navigation path, and the GX-GY plane coincides with the ground.
To guide the vehicle in the navigation phase, we have to know the relationships among the coordinate systems. The VCS and the CCS move with the vehicle, and at the beginning of each navigation process the VCS coincides with the GCS.
The coordinate systems are illustrated in Fig. 3.1.
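As an illustration of the relationship between the VCS and the GCS, a point expressed in the VCS can be mapped to the GCS by a planar rotation followed by a translation. The sketch below is not taken from the system itself; the function name vcs_to_gcs and the convention that the vehicle orientation theta is measured in radians, counterclockwise from the GX-axis to the VX-axis, are assumptions made only for this example.

```python
import math
from typing import Tuple


def vcs_to_gcs(
    point_v: Tuple[float, float],
    vehicle_pose: Tuple[float, float, float],
) -> Tuple[float, float]:
    """Map a point (vx, vy) in the VCS to (gx, gy) in the GCS.

    vehicle_pose = (x, y, theta): position of the VCS origin in the GCS and
    the assumed counterclockwise angle from the GX-axis to the VX-axis.
    """
    vx, vy = point_v
    x, y, theta = vehicle_pose
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the point into the GCS orientation, then translate by the pose.
    return (x + c * vx - s * vy, y + s * vx + c * vy)
```

With the pose (0, 0, 0), which corresponds to the start of navigation where the VCS coincides with the GCS, the function returns the input point unchanged.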
In this study, we use three KINECT devices mounted on the vehicle to sense the environment. When we bring the vehicle to a place that is a path node, the proposed system records the position data of the vehicle in 3D space with respect to the selected landmarks. Then, when the vehicle moves along the path in the navigation session, we can adjust the vehicle location according to the position learned at the currently visited path node.
3.3 Learning of Outdoor Guidance Parameters and Landmark Features
3.3.1 Learning of Outdoor Guidance Parameters
For the vehicle to navigate in an outdoor environment, a trainer of the proposed vehicle system should guide the system to learn and record parameters or features of the environment. The parameters to be learned in this study include depth data, landmark feature, detection window, KINECT device number, and some other ground-truth parameters. The proposed techniques for learning these environment parameters are described in the following.
3.3.2 Learning of Navigation Paths Composed of Nodes
In general, the vehicle navigates in an outdoor environment under the control of a user. At each visited path node, the proposed system normally takes the odometer reading as the position data of the vehicle. The position data consist of the vehicle coordinates (VX, VY) and the vehicle orientation in the VCS. We use these data to assist the vehicle system in conducting localization. Using these position data together with the concept of sequential-node visiting to conduct vehicle localization is the main principle of vehicle guidance adopted in this study.
Specifically, we save the position data provided by the KINECT device, the vehicle-turning parameters, and the vehicle coordinates (VX, VY) as the data of a node Ni when the vehicle is in one of the following two situations:
1. the user controls the vehicle to learn a landmark object;
2. the user controls the vehicle to turn and record the turning parameters.
In addition to containing data items mentioned above, each node is labeled with a serial number. Such nodes then form a graph of the learned path. When the user controls the vehicle to move to a desired position, he/she will instruct the vehicle system to collect the node data semi-automatically. When the learning stage is finished, the system will have a set of nodes, denoted as Npath. The process of recording the path data is described as an algorithm in the following.
Algorithm 3.1. Path node recording.
Input: The 3D data provided by the KINECT device and the coordinates provided by the odometer.
Output: A set of path nodes Npath = {N0, N1, N2, …, Nt}.
Steps:
Step 1. Record the initial position of the vehicle, (x0, y0) = (0, 0), in the first node of the set Npath and mark the node as N0.
Step 2. Create a node Ni in the set Npath and record the odometer readings (xi, yi) in Ni when the vehicle is in either of the following situations:
Step 2.1. the user controls the vehicle to learn a landmark object;
Step 2.2. the user controls the vehicle to turn and inputs the turning parameters.
Step 3. Repeat Step 2 until the learning process is finished.
Step 4. Create the terminal node Nt in the set Npath.
Step 5. Save all the nodes of the set Npath in the computer and create a path map.
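The steps of Algorithm 3.1 can be sketched in code as follows. This is a minimal illustration under stated assumptions rather than the actual implementation: the Node class, the record_path function, and the representation of user actions as (kind, position, turning) tuples are hypothetical, and saving the nodes to storage (Step 5) is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Position = Tuple[float, float]


@dataclass
class Node:
    index: int                       # serial number in learning order
    position: Position               # odometer reading (xi, yi)
    kind: str                        # "start", "landmark", "turn", or "terminal"
    turning: Optional[float] = None  # turning parameter, if any (assumed scalar)


def record_path(
    events: Iterable[Tuple[str, Position, Optional[float]]],
    terminal_position: Position,
) -> List[Node]:
    """Build the node set Npath from user-triggered learning events."""
    # Step 1: the first node N0 holds the initial position (0, 0).
    n_path = [Node(0, (0.0, 0.0), "start")]
    # Steps 2-3: add a node whenever a landmark is learned or a turn is made.
    for kind, position, turning in events:
        if kind in ("landmark", "turn"):
            n_path.append(Node(len(n_path), position, kind, turning))
    # Step 4: close the path with the terminal node Nt.
    n_path.append(Node(len(n_path), terminal_position, "terminal"))
    return n_path
```

Indexing each node by its position in the list keeps the serial numbers consistent with the order of learning, which is the property the later landmark detection relies on.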
We show an illustration of the path in our experimental environment for this study in Fig. 3.2, which is part of the sidewalk of National Chiao Tung University. All of the nodes shown are recorded by the user. Each node is labeled with an index number according to the order of learning. The index numbers are useful for path map creation and landmark detection.
Figure 3.2 An illustration of the learned path nodes in the experimental environment for this study (part of the sidewalk in National Chiao Tung University).
3.3.3 Learning of Landmark Detection and Ground-truth Parameters
In this study, we use a line-following technique to navigate the vehicle along a sidewalk path that has a curb line. We find that such a line-segment landmark is usually projected onto a fixed region in the image. Because of this property, we only need to process part of the image, which reduces the computation time. Accordingly, we define a region of interest (ROI) in the image as shown in Fig. 3.3, which is also called a detection window.
Similarly, we record which KINECT device is used to detect each landmark along the path. The KINECT devices are labeled with serial numbers as shown in Fig. 3.4. In the navigation stage, the serial number recorded in a path node is retrieved to decide which KINECT device should be used, and detection is repeated with that device until the target landmark is found. Because detection runs continuously, the computation load in the navigation process is considerable. Once the relevant parameters are learned, however, only the data acquired by the specified KINECT device need to be processed, so we do not have to use more than one KINECT device to detect the landmark at the same time unless we want to. In this way, we speed up the computation and thereby increase the navigation speed.
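The saving obtained from the detection window and the recorded device number can be illustrated with a short sketch. The names crop_window and window_for_node, the (top, left, height, width) window layout, and the dictionary keys "device" and "window" are assumptions made for this example only; images are represented as plain nested lists to keep the sketch self-contained.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]
Window = Tuple[int, int, int, int]  # (top, left, height, width), assumed layout


def crop_window(image: Image, window: Window) -> Image:
    """Return only the pixels inside the detection window."""
    top, left, height, width = window
    return [row[left:left + width] for row in image[top:top + height]]


def window_for_node(images_by_device: Dict[int, Image], node: dict) -> Image:
    """Pick the image of the learned KINECT device and crop it to the learned ROI."""
    image = images_by_device[node["device"]]  # only one device is processed
    return crop_window(image, node["window"])
```

Only the pixels of one device inside one window reach the detector, which is the source of the reduction in computation described above.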
In addition, some ground-truth data are measured in the learning process, such as the angle of any ramp and the distance of the vehicle to the curb along the sidewalk.
We will describe in Chapter 6 how these parameters are used in this study.
3.3.4 Learning of Landmark Features in Color and Depth Images
In order to learn selected landmarks, we design a user interface to help users specify the landmark which they want to use. When a user controls the vehicle to a position beside a landmark to be learned, he/she can select one of the KINECT devices to acquire the color and depth images, and then manually drag a rectangle as an ROI to segment out the landmark which appears in the color image. Next, a SURF extraction algorithm [20], which is described in detail in Chapter 5, is applied to obtain the feature set of the ROI. Then, the depth data provided by the KINECT device, the feature set of the landmark, the KINECT device number, and the ROI are saved into the learned data set. A flowchart is illustrated in Fig. 3.5, and the details of the process are described in the following as an algorithm.
Figure 3.3 Curb line in the detection window. (a) Color image. (b) Depth image.
Figure 3.4 An illustration of KINECT device numbers (devices No. 1, No. 2, and No. 3 on the vehicle).
Figure 3.5 A flowchart of the landmark learning process.
Algorithm 3.2. Learning of a selected landmark.
Input: the position P of a selected landmark M.
Output: information data of landmark M.
Steps:
Step 1. Control the vehicle to position P beside the landmark M.
Step 2. Select one of the KINECT devices as specified in the path to acquire a color image I and a depth image D.
Step 3. Drag a rectangle on image I as an ROI R.
Step 4. Apply the SURF extraction algorithm on the ROI to extract a feature set S.
Step 5. Save the depth image D, the KINECT device number, the feature set S, and the ROI R manually in the record of the current path node corresponding to landmark M.
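Steps 3 to 5 of Algorithm 3.2 can be sketched as the construction of one landmark record. This is an illustration under stated assumptions, not the system's code: LandmarkRecord and learn_landmark are hypothetical names, the ROI is assumed to be stored as (top, left, height, width), and the feature extractor is passed in as a function so that any SURF implementation could be supplied; no particular SURF library call is assumed here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]
Roi = Tuple[int, int, int, int]  # (top, left, height, width), assumed layout


@dataclass
class LandmarkRecord:
    device_number: int  # which KINECT device acquired the images
    roi: Roi            # rectangle R dragged by the user
    depth: Image        # depth image D
    features: list      # feature set S extracted from the ROI


def learn_landmark(
    image: Image,
    depth: Image,
    device_number: int,
    roi: Roi,
    extract_features: Callable[[Image], list],
) -> LandmarkRecord:
    """Crop image I to ROI R, extract features, and bundle the learned data."""
    top, left, height, width = roi
    patch = [row[left:left + width] for row in image[top:top + height]]
    # extract_features stands in for the SURF extraction step (Step 4).
    return LandmarkRecord(device_number, roi, depth, extract_features(patch))
```

The returned record would then be stored in the path node that corresponds to landmark M.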
In this study, gray-level depth images composed of depth data provided by the KINECT device are used as inputs to the SURF extraction algorithm. An example of such depth images is shown in Fig. 3.6. In principle, the above landmark learning algorithm is not well suited to this type of depth image because far fewer feature points can be extracted from such a gray-level image than from a color image. However, our experiments on extracting SURF features from depth images for landmark localization show that the results of using the depth image alone are acceptable. More detailed experimental results and vehicle navigation schemes will be described in Chapter 6.
Figure 3.6 A hydrant landmark in a depth image.
Chapter 4
Navigation in Outdoor Environments
4.1 Introduction
When the learning process is finished, we obtain the learned environment information, including a set of landmark features, ground-truth data, ROI images, and a navigation path. In this chapter, we introduce our ideas for vehicle navigation in outdoor environments using this information, and describe how we implement them. Some strategies for conducting the navigation work are described in Section 4.2.1. The detailed algorithm for the proposed navigation process is introduced in Section 4.3, after the two main ideas for guiding the vehicle along the learned path are described.
4.1.1 Strategy of Vehicle Guidance on Learned Paths
In the task of vehicle navigation, a navigation path like that shown in Fig. 3.2 is established in advance. The path has a starting point and an end point, as well as some spots of interest that the vehicle passes through between them. In this study, we have chosen a starting point and an end point on a path of interest along part of the sidewalk in National Chiao Tung University as our experimental environment, and recorded the features and positions of some pre-selected landmarks along the path. We have also “learned” some environment parameters, such as the speed of the vehicle, the angle of each path turning, and the ground-truth data of a ramp and a curb segment, to assist the vehicle in navigating along the path successfully, as shown in Fig. 4.1. When the above-mentioned tasks are finished, the vehicle is considered able to navigate along the learned path.
However, besides guiding the vehicle to learn the above-mentioned parameters, a vehicle navigation strategy is also important in this study. The strategy proposed in this study to conduct the navigation work is introduced in Section 4.2. The detailed algorithm for the proposed navigation process is introduced in Section 4.3.
Figure 4.1 Two types of landmarks selected for use in this study. (a) Curb line. (b) Ramp.
4.1.2 Localization by Sequential Landmarks
As mentioned previously, the vehicle navigation process usually generates mechanical errors, resulting in imprecise computations of vehicle positions. To solve the problem, a strategy adopted in this study is to guide the vehicle to constantly localize its position based on the sequentially learned landmarks. Specifically, after detecting and localizing a landmark in the acquired KINECT images by the use of the proposed methods (introduced later in Chapter 5) and obtaining the vehicle position relative to the landmark, we can adjust the vehicle’s position and orientation to the state learned in the learning phase at the current spot.
In addition, because the learned path is along a sidewalk and we use the concept of sequential-node visiting to conduct vehicle localization, the use of a curb line feature on the sidewalk is practical in this study. We use the learned curb-line parameter to achieve line following to correct the vehicle’s orientation for navigation along the learned path on the sidewalk.
4.2 Proposed Navigation Process
4.2.1 Strategies for Proposed Navigation Process
In this section, we introduce the strategies proposed in this study for vehicle navigation on the learned path. First, the navigation process reads a learned navigation path and the related guidance parameters, which were stored on the laptop computer. The navigation path consists of several nodes which were labeled in sequential order in the learning process. The vehicle is guided according to the concept of sequential-node visiting to visit each node in turn and conduct vehicle localization. Several strategies are proposed to guide the vehicle to the pre-selected destination successfully. They are described as follows.
1. The vehicle always follows the curb line on the sidewalk if possible. After detecting the curb line, the vehicle modifies its orientation to keep a safe distance with respect to the curb line on the sidewalk.
2. The vehicle localizes its position according to the learned sequential landmarks along the path. We adjust the vehicle’s position in the GCS according to the learned landmark position and the current landmark position which are computed using the acquired images at the vehicle’s current location.
3. An object detection process is conducted continuously to detect nearby objects. When a suspected landmark object appears in the detection window, the vehicle stops moving forward and matches the object against the recorded landmark.
By the above strategies, the vehicle can be expected to navigate to the desired destination successfully. A flowchart in accordance with the above three strategies is shown in Fig. 4.2.
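The position adjustment in the second strategy can be illustrated with a simple translation-only correction. This sketch is not the method of this study, which is described in later chapters; it assumes that the landmark offsets are expressed in the VCS, that the vehicle's current orientation equals the learned orientation theta (in radians, counterclockwise from the GX-axis), and that the name corrected_position is hypothetical. Under these assumptions, the landmark lies at node + R(theta)·learned in the GCS and also at vehicle + R(theta)·current, so the vehicle position is node + R(theta)·(learned − current).

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def corrected_position(
    node_position: Vec,   # learned vehicle position at the node, in the GCS
    theta: float,         # learned vehicle orientation at the node (assumed shared)
    learned_offset: Vec,  # landmark offset seen from the vehicle when learned (VCS)
    current_offset: Vec,  # landmark offset seen from the vehicle now (VCS)
) -> Vec:
    """Estimate the current vehicle position in the GCS from one landmark."""
    dx = learned_offset[0] - current_offset[0]
    dy = learned_offset[1] - current_offset[1]
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the offset difference into the GCS and add it to the node position.
    return (node_position[0] + c * dx - s * dy,
            node_position[1] + s * dx + c * dy)
```

For example, if the landmark was learned 2.0 units ahead of the node and is now seen 2.5 units ahead, the vehicle is estimated to be 0.5 units short of the node along its heading.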
4.2.2 Idea of Vehicle Localization by Learned Sequential Landmarks
Although the odometer readings provide the vehicle’s position and direction for vehicle navigation in the navigation phase, they are usually too imprecise to guide the vehicle to the next position correctly. Therefore, using the learned landmarks, which include light poles, hydrants, sidewalk curb lines, and tree trunks in this study, to