Chapter 1 Introduction
1.5 Thesis Organization
The remainder of this thesis is organized as follows. In Chapter 2, we introduce the configurations of the proposed system and the system processes. In Chapter 3, the proposed methods for learning guidance parameters and navigation paths are described.
In Chapter 4, we describe the proposed navigation strategies, including ideas, guidance techniques, and detailed navigation algorithms. In Chapter 5, the two proposed new space line detection techniques are introduced and their applications for natural landmark detection are described; and in Chapter 6, their applications for artificial landmark detection are described. In Chapter 7, we show some experimental
Chapter 2
System Design and Processes
2.1 Ideas of System Design
As mentioned in Chapter 1, many good facilities for blind people have been proposed in the past. Using a vision-based autonomous vehicle as a machine guide dog is a good idea because the trainer does not have to spend much time training it and it can work all day without taking a rest. If the vehicle is equipped with a camera, it will be able to “see” the surrounding environment and avoid obstacles along the way.
However, the task of combining the camera and the vehicle system is not easy to accomplish. We need a control unit which connects the camera and the vehicle system, analyzes the acquired image data, integrates all the information, and makes decisions.
In this chapter, we describe in Section 2.2 the software and hardware systems of the proposed machine guide dog, which accomplish the above-mentioned tasks. The details of the proposed method for 3D data acquisition using the camera are described in Section 2.3.
Because the vehicle originally has no “knowledge” of how to navigate on the sidewalk, it may cause accidents, such as collisions with obstacles or falling off the sidewalk. Therefore, before the machine guide dog can navigate by itself, we should “teach” it the outdoor information and how to deal with different conditions. In addition, strict navigation strategies should be designed to protect the blind person and the vehicle from accidents. Finally, the vehicle should be capable of navigating along a learned path again and again. To reach the above goals, the system processes of the autonomous vehicle system have to be organized well. The system processes will be described in Section 2.4, including the learning process in Section 2.4.1 and the navigation process in Section 2.4.2.
2.2 System Configuration
To construct the proposed system, we adopt a Pioneer 3-DX vehicle made by MobileRobots Inc. The vehicle is equipped with an imaging system composed of a stereo omni-camera. The imaging system is not only part of the vehicle system but also plays an important role in gathering information and locating the vehicle. The autonomous vehicle and other associated hardware equipment will be introduced in Section 2.2.1, and the camera system will be described in Section 2.2.2. Besides the hardware, software is needed to provide a friendly interface for users to control the vehicle conveniently. The software system we developed for use in this study will be described in Section 2.2.3.
2.2.1 Hardware configuration
The hardware architecture of the proposed machine guide dog is shown in Figure 2.1. It can be partitioned into three principal systems: the vehicle system, the camera system, and the control system. We will describe these systems, respectively, in the following.
In the vehicle system, the Pioneer 3-DX mentioned above is shown in Figure 2.2. It has a 44cm×38cm×22cm aluminum body with two 19cm wheels and a 7cm caster. It can reach a speed of 5.76 kilometers per hour on flat floors, has a maximum rotation speed of 300 degrees per second, and can climb an incline with a slope of up to 30 degrees. Moreover, the vehicle has sixteen ultrasonic sensors. They are
three 12V rechargeable lead-acid batteries and can run 18-24 hours if the batteries are fully charged initially. Furthermore, the vehicle is equipped with an odometer which records the pose of the vehicle, including the position and the orientation with respect to its initial pose, for each navigation cycle. The odometer also provides readings of the vehicle speed, the battery voltage, etc.
Figure 2.1 Three different views of the used hardware architecture, which includes a vehicle and a stereo camera. (a) A 45° view. (b) A front view. (c) A side view.
The second part of the system hardware is the camera system. It is a two-mirror omni-camera which consists of one perspective camera, one lens, and two reflective mirrors of different sizes, all integrated into a single structure. A picture of the camera
2.4. The camera is of the model ARCAM-200SO, which is produced by ARTRAY Company with the size of 33mm×33mm×50mm and the resolution of 2.0M pixels.
The detailed specifications of the camera are listed in Table 2.1. The lens is produced by Sakai Co. and has a variable focal length of 6-15mm. The two reflective mirrors are produced by Micro-Star International Co. The structure of the camera system will be described in more detail in the next section.
In the control system, we utilize a laptop computer as the main unit. It is of model R840 produced by TOSHIBA Computer Inc. as shown in Figure 2.5. We use an RS-232 to connect the laptop computer and the autonomous vehicle and use a USB to
Figure 2.2 The autonomous vehicle, Pioneer 3-DX, produced by MobileRobots Inc., used in this study. (a) A back view. (b) A front view.
in Table 2.2.
Figure 2.4 The used camera and lens. (a) The camera of model Arcam-200so produced by ARTRAY Co. (b) The lens produced by Sakai Co.
Table 2.1 The specification of Arcam-200so.
Size                                   33mm×33mm×50mm
CMOS size                              1/2” (6.4 × 4.8mm)
Mount                                  C-mount
Max resolution                         2.0 M pixels
Frames per second at max resolution    8 fps
Figure 2.3 The camera system used in this study.
Figure 2.5 The laptop computer of model TOSHIBA R840 used in this study.
2.2.2 Structure of used two-mirror omni-camera
In this section we will introduce the two-mirror omni-camera we use in this study.
As shown in Figure 2.6, a space point G is projected by the two mirrors onto the image plane of the camera system. The light rays coming from point G are reflected by the two mirrors to go through the lens center. Both mirrors have a hyperboloidal shape. We will call the big mirror Mirror B and the small one Mirror S in the sequel of this thesis. As is well known, a hyperboloidal shape has two focal points. We denote one focal point of Mirror S by fs and one focal point of Mirror B by fb. The two mirrors are configured in such a way that their other focal points are located at an identical point, which is just the lens center fc of the camera. Besides, the distance from the mirror center of Mirror B to that of Mirror S is 20 cm according to our manual measurement in this study.
We call it the baseline. The detailed information of the two hyperboloidal-shaped
Table 2.2 Specification of the laptop computer.
CPU         Intel Sandy Bridge Core i5-2410M 2.3GHz
RAM         4G DDR 1333MHz
GPU         AMD Radeon HD 6450 / 1024MB
HDD size    640 GB
Table 2.3 Specifications of the used two hyperboloidal-shaped mirrors.
            Radius    Parameter a    Parameter b
Mirror S    2 cm      2.41 cm        4.38 cm
Mirror B    12 cm     11.46 cm       9.68 cm
Besides having two focal points, the hyperboloidal shape has another property, as shown in Figure 2.7: if a light ray goes toward one of the focal points, it will be reflected by the mirror to go through the other focal point. This property has been utilized to construct the omni-camera in a previous study [22]. According to this property, light rays from a space point G first go toward the centers of the two mirrors, are then reflected by the mirrors to go through the lens center fc, and are finally projected onto the CMOS sensor of the camera. Therefore, we have two distinct image points corresponding to the single space point G. Based on this phenomenon, we can compute the range data of G. The details will be described later in this chapter.
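The reflection property stated above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it models the mirror cross-section as the upper branch of the hyperbola z²/a² − r²/b² = 1 with focal points at (0, ±c), where c = sqrt(a² + b²). This parameterization and the function name are assumptions made here, and they may differ from the convention behind the mirror parameters listed in Table 2.3.

```python
import math

def reflect_toward_focus(a, b, r):
    """Reflect a ray aimed at the focus (0, c) off the hyperbola
    z^2/a^2 - r^2/b^2 = 1 (upper branch) at radial offset r.

    Returns (reflected_direction, direction_to_other_focus), both unit
    2-D vectors given as (r-component, z-component).
    Assumed parameterization; not taken from the thesis.
    """
    c = math.sqrt(a * a + b * b)                 # focal distance from the origin
    z = a * math.sqrt(1.0 + (r * r) / (b * b))   # mirror point P = (r, z)

    # Unit direction of the incoming ray, heading from P toward focus (0, c).
    in_r, in_z = -r, c - z
    norm = math.hypot(in_r, in_z)
    in_r, in_z = in_r / norm, in_z / norm

    # Unit normal of the curve at P, from the gradient of z^2/a^2 - r^2/b^2.
    n_r, n_z = -2.0 * r / (b * b), 2.0 * z / (a * a)
    norm = math.hypot(n_r, n_z)
    n_r, n_z = n_r / norm, n_z / norm

    # Mirror reflection: d_out = d_in - 2 (d_in . n) n.
    dot = in_r * n_r + in_z * n_z
    out = (in_r - 2.0 * dot * n_r, in_z - 2.0 * dot * n_z)

    # Unit direction from P to the other focus (0, -c).
    to_r, to_z = -r, -c - z
    norm = math.hypot(to_r, to_z)
    return out, (to_r / norm, to_z / norm)
```

For any a, b, and r, the two returned directions should agree up to rounding error, which is the property that lets both mirrors send their reflected rays through the common lens center.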
In addition, initially we place the two-mirror omni-camera in such a way that the axis going through Mirror S and Mirror B is perpendicular to the ground, as illustrated in Figure 2.7. However, it was found in [22] that in the resulting fields of view (FOVs) of Mirrors B and S, the overlapping area on the ground is too small to be useful for computing precise range data. In this study, it is desired that the overlapping FOV be as large as possible. To solve this problem, the camera system is slanted by an angle of γ as shown in Figure 2.8. It can be seen that the overlapping region is now bigger than before.
Figure 2.6 An illustration of the two-mirror omni-camera and a space point projected on the CMOS sensor of the camera.
Figure 2.8 Two different placements of the two-mirror omni-camera on the vehicle and the region of overlapping. (a) The optical axis going through the two mirrors is parallel to the ground. (b) The optical axis through the two mirrors is slanted up for an angle of γ.
2.2.3 Software configuration
MobileRobots Inc., which provides the autonomous vehicle used in this study, also provides an application interface called ARIA (Advanced Robot Interface for Applications) for the user to control the vehicle. ARIA is an object-oriented interface which can be used under the Linux or Win32 operating system with the C++ language. Therefore, we can use ARIA to communicate with the embedded sensor system in the vehicle and obtain the information which the vehicle offers in order to control the position of the vehicle.
For the camera system, ARTRAY provides a tool called Capture Module Software Developer Kit (SDK). It is an object-oriented interface, and its application interface is available in several programming languages such as C, C++, VB.net, C#.net, and Delphi. We use the SDK to capture image frames with the camera and
system, we use Borland C++ Builder 6 with Update Pack 4 on the Windows XP operating system. Borland C++ Builder 6 is a GUI-based integrated development environment (IDE). It is convenient for us to provide a friendly interface for the user.
2.3 3D Data Acquisition by the Two-mirror Omni-camera
2.3.1 Review of imaging principle of two-mirror omni-camera
In this section, we review the two-mirror omni-camera proposed in Huang and Tsai [22] and used in this study, as well as the formulas for range data computation using images captured by such a camera system. First, we review the image projection principle of an omni-camera. As shown in Figure 2.9, we use two coordinate systems, the image coordinate system (ICS) and the camera coordinate system (CCS), to illustrate the principle of the imaging process. The ICS is a two-dimensional U-V coordinate system whose origin is the center of the omni-image, and the CCS is a three-dimensional X-Y-Z coordinate system whose origin is the focal point Om of the hyperboloidal-shaped mirror. As mentioned previously, a light ray from a space point G at (x, y, z) in the CCS goes toward the focal point Om of the hyperboloidal-shaped mirror and is reflected by the mirror. Then, it goes through the other focal point at the lens center Oc. Finally, it is projected onto an image point I at (u, v) on the omni-image plane. As a result, each image point in an omni-image can be specified by an elevation angle and an azimuth angle. After the azimuth angle
space point G.
Figure 2.9 Imaging principle of a space point G using an omni-camera.
2.3.2 Derivation of formulas for 3D data acquisition
In this section, we introduce the principle of computing the 3D range data. We first define the directions of the local camera coordinate system CCSlocal as shown in Figure 2.10. As seen in the figure, two light rays from G in CCSlocal go through the center of Mirror S and that of Mirror B, and α1 and α2 are the respective elevation angles. The points Os, Ob, and G form a triangle OsObG, which is illustrated in Figure 2.11. The distance from Os to Ob, which was called the baseline in the last section, is known by manual measurement. We can derive the following equation by the law of sines based on the geometry:
sin(90o b) sin( b)
d b
. (2.1)
Then, we can calculate the distance d accordingly as follows:
sin(90 ) actually are equal, which we denote by θ. From Figure 2.12, the azimuth angle θ in the ICS can be computed by using the image coordinates (u1, v1) of G according to the following equations:
Figure 2.11 An illustration of the relation between a space point G and the two mirrors in the used camera. (a) A side view of G projected onto the two mirrors. (b) A triangle ObOsG used in deriving 3D data.
After obtaining the distance d by Equation (2.2) and the azimuth angle θ by Equation (2.3), we can compute the position of G, namely, its coordinates (X, Y, Z) in CCSlocal, by the following equations:
X = d × cosαa × sinθ,
Y = d × cosαa × cosθ,
Z = d × sinαa. (2.4)
Figure 2.12 An illustration of a space point G at coordinates (X, Y, Z) in CCSlocal.
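To make the range computation concrete, the following Python sketch triangulates a space point from two elevation angles and one azimuth angle. It is a minimal illustration under assumptions made here, not the exact formulation of this study: the first mirror center is placed at the origin, the second one at a distance equal to the baseline along the positive Z-axis, both elevation angles are measured from the plane perpendicular to that axis, and the azimuth θ is measured from the Y-axis toward the X-axis. Under these assumptions, the law of sines in the triangle formed by the two mirror centers and G gives d = b × cos α2 / sin(α1 − α2). The function names are hypothetical.

```python
import math

def range_from_elevations(alpha1, alpha2, baseline):
    """Distance d from the first mirror center (origin) to G.

    alpha1, alpha2: elevation angles (radians) seen from the first and
    second mirror centers; the second center is assumed to lie at
    (0, 0, baseline). Derived from the law of sines.
    """
    return baseline * math.cos(alpha2) / math.sin(alpha1 - alpha2)

def point_from_angles(d, alpha, theta):
    """Spherical-to-Cartesian conversion with sin(theta) on X and cos(theta) on Y."""
    x = d * math.cos(alpha) * math.sin(theta)
    y = d * math.cos(alpha) * math.cos(theta)
    z = d * math.sin(alpha)
    return x, y, z

def triangulate(alpha1, alpha2, theta, baseline):
    """Recover (X, Y, Z) of G from the two elevation angles and the azimuth."""
    d = range_from_elevations(alpha1, alpha2, baseline)
    return point_from_angles(d, alpha1, theta)
```

A quick way to gain confidence in such a sketch is a round trip: choose a point, compute the angles that the two mirror centers would observe, and verify that triangulate returns the original coordinates. Note that the computation becomes ill-conditioned as α1 − α2 approaches zero, that is, for points far away relative to the baseline.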
As mentioned previously, the camera system we use in this study is not set parallel to the ground; instead, to enlarge the overlapping area of both mirrors, it is slanted up by an angle of γ as shown in Figure 2.13. It is desired that the Z-axis of CCSlocal be parallel to the ground. We define another camera coordinate system CCS
2.4 System Operation Processes
2.4.1 Learning process
In the learning process, it is important for the autonomous vehicle to “learn” the selected path so that it can later conduct navigation automatically. In this section, we will describe in detail the information which the vehicle should “memorize.” Initially, the vehicle has to record where the selected path in the outdoor environment is. For this study, the experimental place is a sidewalk on the campus of National Chiao Tung University.
Because the vehicle navigates on the sidewalk, we can take advantage of the sidewalk curb to implement a “curb following” function for vehicle guidance. It also helps the vehicle to calibrate the odometer precisely. In addition, the lighting condition is a concern in the outdoor environment. Therefore, the environment information at different locations must be recorded along the navigation path.
Moreover, another odometer calibration method adopted in this study is via landmark detection. By this method, the trainer can choose the landmarks which should be learned along the selected path, and decide where to localize the vehicle by the learned landmarks. Next, landmark detection is accomplished by a space line
Figure 2.13 The relation between the two camera coordinate systems CCS and CCSlocal.
detection technique in this study, which is described in Chapter 5. After collecting enough information of the learned landmark, some parameters for landmark detection and the position of each landmark can be recorded.
To make it easier to learn navigation paths, a user learning interface is designed for the trainer, who may use it to control the autonomous vehicle to construct the navigation path. Furthermore, after the vehicle detects a landmark, we provide a semi-automatic learning process for the trainer to adjust the initially set parameters to deal with varying conditions of the environment. Also, the trainer should establish the navigation rules in advance for the vehicle to follow in the navigation process.
Finally, after the autonomous vehicle is led to the destination, the learning process is finished. The learned information is organized into a learned path which is composed of several path nodes with guidance parameters. We finally acquire the navigation path map, which combines the landmark information and the environment information, and store it on the disk. The entire learning process proposed in this study is shown in Figure 2.14.
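One possible way to picture a learned path composed of path nodes with guidance parameters is sketched below in Python. This is purely illustrative: the class name, the field names, the mode strings, and the JSON storage format are all hypothetical choices made here, not the data structures or file format actually used in this study.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class PathNode:
    """One node of a learned path (all field names are hypothetical)."""
    x: float                      # odometer position at learning time
    y: float
    heading_deg: float            # odometer orientation at learning time
    mode: str = "curb_following"  # e.g. "curb_following" or "landmark_detection"
    params: dict = field(default_factory=dict)  # guidance parameters for this node

def path_to_json(nodes):
    """Serialize a list of PathNode objects to a JSON string."""
    return json.dumps([asdict(node) for node in nodes], indent=2)

def path_from_json(text):
    """Rebuild the list of PathNode objects from a JSON string."""
    return [PathNode(**item) for item in json.loads(text)]
```

Keeping the per-node parameters in a free-form dictionary lets a landmark node carry detection parameters and a landmark position while a curb-following node carries only environment information such as a camera exposure setting.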
2.4.2 Navigation process
In the navigation process, the autonomous vehicle can analyze the current location using various stored information obtained in the learning process and navigate to the next node on the learned path. The entire navigation process proposed in this study is shown in Figure 2.15.
In general, the autonomous vehicle analyzes the current environment node by node to navigate to the goal according to the learned information data retrieved from
too light, the vehicle can be “confused” by the images acquired from the camera system.
According to the learned environment information, the system is designed to be able to adjust the exposure of the camera dynamically if necessary.
Besides, the autonomous vehicle always checks whether any obstacle exists in front of it. As soon as an obstacle is found and determined to be too close to the vehicle, a collision avoidance procedure is started automatically. In addition, if the vehicle reaches a node of “landmark detection,” it will adjust the detection pose and load the parameters for landmark detection. If a landmark is found successfully, the landmark’s position is used to correct the odometer readings of the vehicle; if not, some strategies for recovering the landmark are started, such as changing the parameters for landmark detection or changing the pose of the vehicle, so as to detect the landmark successfully.
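The node-by-node control flow described above can be sketched as follows. This Python fragment is a simplified illustration only: SimulatedVehicle is a hypothetical stand-in for the real vehicle, camera, and landmark detector, all method names are invented here, and the obstacle check is performed once per node rather than continuously as in the actual system.

```python
class SimulatedVehicle:
    """Hypothetical stand-in for the vehicle, camera, and detector interfaces."""

    def __init__(self, obstacles=(), attempts_needed=None):
        self.obstacles = set(obstacles)               # node indices with a close obstacle
        self.attempts_needed = attempts_needed or {}  # node index -> failed tries before success
        self.log = []                                 # record of actions taken

    def obstacle_too_close(self, index):
        return index in self.obstacles

    def avoid_collision(self, index):
        self.obstacles.discard(index)
        self.log.append(("avoid", index))

    def detect_landmark(self, index, attempt):
        return attempt >= self.attempts_needed.get(index, 0)

    def recover(self, index):
        # Stands for changing detection parameters or the vehicle pose.
        self.log.append(("recover", index))

    def correct_odometer(self, index):
        self.log.append(("correct", index))

    def move_to(self, index):
        self.log.append(("move", index))


def navigate(path, vehicle, max_retries=2):
    """Visit the path nodes in order, handling obstacles and landmark nodes."""
    for index, node in enumerate(path):
        if vehicle.obstacle_too_close(index):
            vehicle.avoid_collision(index)
        if node["mode"] == "landmark_detection":
            for attempt in range(max_retries + 1):
                if vehicle.detect_landmark(index, attempt):
                    vehicle.correct_odometer(index)
                    break
                vehicle.recover(index)
        vehicle.move_to(index)
    return vehicle.log
```

Bounding the number of recovery attempts is a design choice of this sketch; it keeps the vehicle from stalling indefinitely at a landmark that can no longer be detected.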
Chapter 3
Learning Strategy for Automatic Navigation
3.1 Introduction
The purpose of the learning process for the proposed machine guide dog system is to create a path on a sidewalk to guide a blind person to a selected destination.
Before starting to learn a path, some preparatory work has to be done. First, we have to choose some landmarks for vehicle localization. Then, we have to calibrate the camera system. The third task is to infer some guidance parameters. Finally, we should adopt a learning strategy to learn certain information about each selected landmark.
3.1.1 Selected landmarks in outdoor environments for this study
When the vehicle is in the navigation process, mechanical errors usually accumulate and cause imprecise odometer readings of the vehicle location and orientation. To solve this problem, in this study we adopt the approach of vehicle localization using landmarks. For this purpose, some objects should first be selected as landmarks to conduct the localization work. Chou and Tsai [23] detected light poles and hydrants to localize the vehicle. In this study, we select instead some other natural and artificial objects as landmarks, which are commonly seen on sidewalks. Specifically, we select two types of natural landmarks, tree trunk and lawn
the same purpose are three types of artificial landmarks, namely, signboard, traffic cone, and stop line on roads, as shown in Figure 3.2. With more types of landmarks so selected, we can have more information along the way for localization, and we can then guide the autonomous vehicle to the destination more reliably. The detailed proposed methods for vehicle localization using landmarks will be introduced later in Chapters 5 and 6. In this chapter, we discuss the learning process for these landmarks and other information.
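The basic idea of correcting accumulated odometer error with a landmark can be illustrated by the short Python sketch below. It is not the localization method of this study, which is presented in Chapters 5 and 6; it only assumes, for illustration, that the landmark's position in the global frame was recorded during learning, that the landmark is measured in the vehicle frame (x forward, y to the left), and that the vehicle heading is trusted. The function name and frame conventions are hypothetical.

```python
import math

def corrected_position(landmark_global, landmark_in_vehicle, heading_rad):
    """Estimate the vehicle position from one landmark observation.

    landmark_global:     (x, y) of the landmark in the global frame (learned).
    landmark_in_vehicle: (x, y) of the landmark in the vehicle frame,
                         x pointing forward and y to the left (assumed).
    heading_rad:         vehicle heading in the global frame, assumed reliable.
    """
    gx, gy = landmark_global
    lx, ly = landmark_in_vehicle
    cos_h, sin_h = math.cos(heading_rad), math.sin(heading_rad)
    # Rotate the measured offset from the vehicle frame into the global frame.
    dx = cos_h * lx - sin_h * ly
    dy = sin_h * lx + cos_h * ly
    # The vehicle sits at the landmark position minus that offset.
    return gx - dx, gy - dy
```

For example, a vehicle heading along the global Y-axis that sees a landmark 2 units straight ahead, where the landmark was learned at (2, 5), is placed at (2, 3) regardless of what the drifted odometer reports.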
3.1.2 Camera calibration
As mentioned in Chapter 1, it is a complicated task to calibrate a camera’s intrinsic and extrinsic parameters. A space-mapping technique [24], called pano-mapping, is adopted instead to “calibrate” the two-mirror omni-camera system used in this study. We will introduce the adopted technique in Section 3.2.
Figure 3.1 Two types of natural landmarks selected for use in this study. (a) Tree landmark. (b) Lawn corner point.
To navigate in outdoor environments, a trainer of the proposed vehicle system should guide the system to learn and record some parameters of the sidewalk