Thesis Organization - A Study on Optimal Security Patrolling by Multiple Vision-Based Autonomous Vehicles with Omni-Directional Monitoring

Chapter 1 Introduction

1.5 Thesis Organization

The remainder of this thesis is organized as follows. In Chapter 2, we describe the system configuration of the vehicles used as a test bed in this study, as well as the principle of vehicle learning and guidance. In Chapter 3, the proposed techniques for camera calibration, acquiring information about environments, and patrolling tasks are described. In Chapter 4, the proposed methods for performing security patrolling and using top-view omni-cameras to localize and monitor vehicles are described. In Chapter 5, the proposed method for planning paths that are optimal, random, and load-balanced for all autonomous vehicles is described. In Chapter 6, the proposed method for collision avoidance between vehicles in patrolling sessions is described.

Experimental results are shown in Chapter 7. Finally, conclusions and suggestions for future work are given in Chapter 8.

Chapter 2 System Configuration and Navigation Principles

2.1 Introduction

For security surveillance, the use of a vision-based autonomous vehicle saves manpower. The vehicle is mobile, and its ability to move enlarges the view range of security surveillance. In addition, it can monitor low-lying or hidden objects, such as those under a table or in a cabinet.

In this study, two autonomous vehicles are used to perform the security patrolling task, and each of them is equipped with a camera, as shown in Figure 2.1, though the proposed methods are general for any number of vehicles. Because the autonomous vehicles suffer from accumulation of mechanical errors, two cameras with fish-eye lenses, called top-view omni-cameras in the sequel, are installed on the ceiling. With these two cameras, the vehicles can be located and controlled to navigate along correct paths. Some control and communication tools are required between the on-board equipment and the user. The entire system configuration, including hardware and software, is described in Section 2.2.

Before the autonomous vehicles carry out the security patrolling task, a learning stage is necessary, in which the vehicles are taught where to go, what to do, and how to avoid collision with walls. The process of obtaining all the information that enables the vehicles to accomplish their assigned tasks is described in Section 2.3.

The phase in which autonomous vehicles carry out security patrolling is called the navigation phase in this study. In Section 2.4, we will describe the vehicle guidance principle and the process of performing the monitoring task in the navigation phase.

Figure 2.1 The vehicle used in this study is equipped with a camera. (a) A perspective view of the vehicle. (b) A front view of the vehicle. (c) A side view of the vehicle.

2.2 System Configuration

In this study, the vehicle system used as a test bed is composed of a Pioneer 3-DX vehicle made by MobileRobots Inc., a WiBox made by Lantronix, and an Axis 207MW camera made by AXIS, as shown in Figure 2.2. The Axis 207MW camera, called the camera system, is not only one part of the vehicle system but also plays an important role in monitoring and locating vehicles. Because the whole system, called the control system, is controlled by users remotely, some wireless communication equipment is necessary. All the details of the above equipment are described in Section 2.2.1.

In order to develop the desired security surveillance system, we also need software that provides some commands and control interfaces. Besides, we also provide an interface for users to control the vehicles and cameras. All the above utilities are described in Section 2.2.2.

Figure 2.2 The vehicle system used in this study. (a) A Pioneer 3-DX vehicle. (b) A WiBox. (c) An Axis 207MW camera.

2.2.1 Hardware configuration

The entire structure of the vehicle system used in this study is shown in Figure 2.3. There are three principal parts: vehicle system, camera system, and control system.

In the vehicle system, the Pioneer 3-DX vehicle has a 44 cm × 38 cm × 22 cm aluminum body with two 19 cm wheels and a caster. It can reach a speed of 1.6 meters per second on flat floors, and climb grades of 25° and sills of 2.5 cm. At slower speeds it can carry payloads up to 23 kg. The payloads include additional batteries and all accessories. Powered by three 12 V rechargeable lead-acid batteries, the vehicle can run for 18-24 hours if the batteries are fully charged initially. A control system embedded in the vehicle lets the user command the vehicle to move forward or backward or to turn around. The system can also return some status parameters of the vehicle to the user.

To exploit the mobility of the vehicle, a wireless connection between the user and the vehicle is necessary. A WiBox is used to communicate with the vehicle through RS-232, so the user can remotely control the vehicle over a network from anywhere.

In the camera system, an Axis 207MW camera has dimensions of 85×55×40 mm (3.3”×2.2”×1.6”), not including the antenna, and a weight of 190 g (0.42 lb), not including the power supply, as shown in Figure 2.4. The maximum image resolution is 1280×1024 pixels. In our experiment, a resolution of 320×240 pixels is used by the camera fixed on the vehicle and a resolution of 640×480 pixels is used by the one fixed on the ceiling. The frame rates of both are up to 15 fps. Over wireless networks (IEEE 802.11b and 802.11g), captured images can be transmitted to users at speeds up to 54 Mbit/s. Each camera used in this study is equipped with a fish-eye lens that expands the field of view.

Figure 2.3 Structure of proposed system. (Diagram labels: vehicle system with a Pioneer 3-DX vehicle and a WiBox linked by RS-232; camera system with Axis 207MW cameras, one fixed on the vehicle; control system with a computer; access point.)

In the control system, a notebook PC is used to integrate the entire security patrolling system. Through access points, all status information from vehicles and cameras can be delivered to the user over wireless networks. The PC produces commands according to these data. In the same way, vehicles can receive the commands from the control system and perform the corresponding actions. In other words, an access point is a communication medium among the three systems.

Figure 2.4 The camera system used in this study. (a) A perspective view of the camera. (b) A front view of the camera. (c) A left-side view of the camera.

2.2.2 Software configuration

ARIA (Advanced Robotics Interface Application), provided by MobileRobots, Inc., is an API (application programming interface) that assists developers in communicating with the embedded system of the vehicle over either a serial or a TCP/IP connection. It is an object-oriented C++ toolkit usable under Linux or Win32. Therefore, we use Borland C++Builder as the development tool in our experiments to control the vehicles through ARIA. The lowest-level data and other information of the vehicle can also be retrieved easily by means of the ARIA interface.

For controlling the Axis 207MW camera, AXIS also provides a development tool called the AXIS Media Control SDK. Using the Media Control ActiveX component from the SDK, we can preview the camera's view and capture the current image data. The component also makes it convenient to develop functions that take the images grabbed from the camera as input.

2.3 Learning Principle and Proposed Process

Because the patrolling environment is unknown, a learning strategy is necessary.

For the purpose of learning all knowledge that makes the vehicles accomplish the mission successfully, we develop a learning interface for users. The entire learning process is shown in Figure 2.5.

In this study, the data to be recorded are camera-related, object-related, and area-related. The camera-related data are obtained from a camera calibration process. In this study, we do not use the traditional camera calibration method to find a projective matrix for coordinate transformation. Instead, some landmarks on a pattern are utilized to acquire corresponding points between the 2-D image and 3-D global spaces.

For the cameras fixed on the ceiling in this system, the pattern is just the patrolling floor and the landmarks on it are just the corners of the rectangular tiles. A user points out some landmarks in the image with a mouse through the user interface, and the corresponding points in the global space are calculated. Because each camera used in our system is equipped with a fish-eye lens, the images captured by them are warped. Therefore, we use a bilinear interpolation method to transform image coordinates into global-space coordinates by means of these corresponding points.
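As a rough illustration of the interpolation step, the sketch below maps an image point to global coordinates by bilinear interpolation inside one grid cell bounded by four corresponding points. It assumes, for simplicity, that the cell is treated as an axis-aligned rectangle in the image, which holds only approximately under fish-eye distortion; the function name and the corner layout are hypothetical and are not taken from this study.

```python
def bilinear_to_global(u, v, img_min, img_max, g00, g10, g01, g11):
    """Map image point (u, v) to global coordinates inside one grid cell.

    Assumption (not from the thesis): the cell spans the axis-aligned image
    rectangle img_min=(u0, v0) .. img_max=(u1, v1). g00, g10, g01, g11 are
    the global coordinates of the corners at (u0, v0), (u1, v0), (u0, v1),
    and (u1, v1), respectively.
    """
    (u0, v0), (u1, v1) = img_min, img_max
    s = (u - u0) / (u1 - u0)  # normalized horizontal position in [0, 1]
    t = (v - v0) / (v1 - v0)  # normalized vertical position in [0, 1]
    # Each corner is weighted by the area of the sub-rectangle opposite to it.
    weights = ((1 - s) * (1 - t), s * (1 - t), (1 - s) * t, s * t)
    corners = (g00, g10, g01, g11)
    x = sum(w * c[0] for w, c in zip(weights, corners))
    y = sum(w * c[1] for w, c in zip(weights, corners))
    return x, y
```

A point at the center of the image cell is mapped to the center of the corresponding tile, and each image corner is mapped exactly to its recorded global corner.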

The object-related data are used to teach vehicles where to go and which direction to face when they perform the patrolling task. We drive a vehicle to the position where the vehicle can observe the monitored object and then record it as a monitoring point (MP) according to the image of a top-view omni-camera. For the purpose of learning the direction with respect to the object, we control the vehicle to face the object and let it move forward for a short distance. By the two positions of the vehicle (nodes), the direction angle can be obtained.
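The computation of the direction angle from the two recorded positions can be sketched as follows. This is a minimal example, assuming that the angle is measured counterclockwise from the positive x-axis of the global coordinate system; the angle convention and the function name are assumptions, not details given in this study.

```python
import math


def direction_angle(first, second):
    """Heading in degrees, in [0, 360), from position `first` to `second`.

    Assumed convention: 0 degrees points along +x and angles grow
    counterclockwise. Positions are (x, y) pairs in global coordinates.
    """
    dx = second[0] - first[0]
    dy = second[1] - first[1]
    if dx == 0 and dy == 0:
        raise ValueError("the two positions coincide; the direction is undefined")
    # atan2 handles all four quadrants; the modulo maps (-180, 180] to [0, 360).
    return math.degrees(math.atan2(dy, dx)) % 360.0
```

Here `first` would be the recorded MP and `second` the position reached after the short forward move toward the object.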

The area-related data describe the environment where the vehicles patrol. An assumption made in this system is that the floor shape of the environment is composed of rectangular regions. At first, a user must manually key in the corners in clockwise order, and then all rectangular regions are obtained. There might exist some pairs of MPs that do not belong to an identical rectangular region, between which a vehicle cannot move in a straight line. Therefore, some points, called turning points, are necessary, and they can be obtained by processing all the rectangular regions. With these turning points, the distances between all pairs of MPs can be calculated, and the turning points passed between two MPs can also be recorded.

Figure 2.5 Flowchart of proposed learning process.

After all the data are obtained, they are saved into text files. These files can then be used repeatedly in the navigation phase.

2.4 Vehicle Guidance Principle and Proposed Process

When the learning job has been done, all vehicles can start to perform the security patrolling task. The entire guidance process proposed in this study is shown in Figure 2.6.

At first, the system reads all the files that are obtained from the learning phase and contain information about the environment, autonomous vehicles, and monitored objects. According to the distances between all pairs of MPs, the system then plans a random path for each autonomous vehicle. If the difference between the paths of every two vehicles does not exceed a threshold value, which ensures that the loads of all autonomous vehicles are balanced, the security patrolling task can be carried out.
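The load-balance condition above can be sketched as a pairwise check on path lengths. This is only an illustration, assuming that the "difference between paths" means the difference in total path length; the function names and the dictionary layout for the MP distances are hypothetical.

```python
from itertools import combinations


def path_length(path, dist):
    """Total length of a path given as a sequence of MP names.

    `dist` maps frozenset({mp_a, mp_b}) to the learned distance between
    the two MPs (hypothetical storage layout).
    """
    return sum(dist[frozenset((a, b))] for a, b in zip(path, path[1:]))


def loads_balanced(path_lengths, threshold):
    """True if every pair of path lengths differs by at most `threshold`."""
    return all(abs(a - b) <= threshold
               for a, b in combinations(path_lengths, 2))
```

If `loads_balanced` returns `False`, a planner following the description above would draw a new set of random paths and check again.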

Because autonomous vehicles suffer from accumulation of mechanical errors, we need to locate them constantly. Each time a vehicle has traveled a fixed distance, it must be located by the top-view omni-cameras. Using the values of the vehicle's odometer, the system calculates the centroid of each vehicle from an image captured by a top-view omni-camera. The other function of the camera is to monitor the vehicles to see whether they are still under control. If any vehicle is out of control, the system stops all vehicles and sends an alarm message to the user. Otherwise, the odometer of the vehicle is corrected and the vehicle then proceeds to its goal node.

While the vehicles are carrying out the security patrolling, there could be collisions between vehicles. Therefore, the detection of collisions is necessary. This system computes the distance between two vehicles in every cycle of a fixed time duration and determines if they are too close. If true, the paths of the vehicles will be

Figure 2.6 Flowchart of proposed navigation process.
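The periodic proximity check between vehicles described above can be sketched as follows. This is a minimal example under the assumption that vehicle positions are available as (x, y) pairs in global coordinates; the function name and the `safe_distance` parameter are hypothetical.

```python
import math
from itertools import combinations


def close_pairs(positions, safe_distance):
    """Return index pairs (i, j) of vehicles closer than `safe_distance`.

    `positions` is a list of (x, y) vehicle positions in global coordinates.
    """
    pairs = []
    for (i, p), (j, q) in combinations(enumerate(positions), 2):
        if math.hypot(p[0] - q[0], p[1] - q[1]) < safe_distance:
            pairs.append((i, j))
    return pairs
```

Calling this function once per cycle yields the vehicle pairs for which a collision-avoidance action would need to be considered.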

A mission of the autonomous vehicles in this system is to take pictures of all the monitored objects during the navigation process. When a vehicle arrives at an MP, it is in front of a monitored object. Therefore, the direction of the vehicle must be adjusted to face the object. Then the camera on the vehicle takes a picture. The picture is transmitted to the control system over the wireless network and saved into an image file. When all the vehicles have completed their own patrolling paths, one cycle of security patrolling is finished.

Then, the system plans a new set of random paths for all the autonomous vehicles.

Chapter 3 Learning Strategies for Navigation by Semi-automatic Driving

3.1 Ideas of Proposed Techniques Used in Learning

In this study, two cameras with fish-eye lenses fixed on the ceiling are utilized to locate and monitor all autonomous vehicles. Before the cameras are used, they must be calibrated. For this purpose, we propose a point-correspondence technique integrated with an image interpolation method, which avoids the conventional task of calculating the projection matrix for transforming points between the 2-D image and 3-D global spaces. At first, a user points out with a mouse some landmarks in an image of a calibration target, which is selected to be the tile pattern on the floor of our experimental environment. The landmarks we use in this study are the crossing points of the grid formed by the tile pattern. Such crossing points are abundant, and using many of them as corresponding points yields better calibration accuracy in the proposed point-correspondence technique. The details are described in Section 3.2.

In an environment where autonomous vehicles navigate, it is indispensable to use some turning points in the navigation path to ensure that no collision occurs between the vehicles and the walls. To compute the turning points, the corner points of the walkable area are first utilized to acquire all rectangular regions within the entire area. Each region is then represented by its upper-left and lower-right points. With these points, the system can judge whether the vehicles can move in a straight line between any pair of nodes that the vehicles visit (including the turning points). If two nodes belong to different regions, the vehicle is guided to pass through some turning points. In other words, a turning point is a medium that enables the vehicles to navigate between any pair of nodes without colliding with the walls. Therefore, a turning point is selected to be the intersection of the centerlines of two overlapping regions or the center of the overlapping boundary of two adjacent ones. The details of the proposed techniques for processing rectangular regions and computing turning points are described in Section 3.3.1.
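The region test and the turning point of two adjacent regions can be sketched as below. This is an illustrative sketch only: it assumes that each region is stored as `(x_min, y_min, x_max, y_max)` in global coordinates, it covers only the adjacent-region case (the overlapping-region case with centerlines is not shown), and all function names are hypothetical.

```python
def contains(region, p):
    """True if point p = (x, y) lies inside or on the border of `region`."""
    x_min, y_min, x_max, y_max = region
    return x_min <= p[0] <= x_max and y_min <= p[1] <= y_max


def can_move_straight(regions, a, b):
    """True if some rectangular region contains both nodes a and b."""
    return any(contains(r, a) and contains(r, b) for r in regions)


def shared_boundary_center(r1, r2):
    """Center of the boundary shared by two adjacent regions, or None."""
    # Extent of the intersection of the two closed rectangles.
    x_lo, x_hi = max(r1[0], r2[0]), min(r1[2], r2[2])
    y_lo, y_hi = max(r1[1], r2[1]), min(r1[3], r2[3])
    if x_lo > x_hi or y_lo > y_hi:
        return None  # the regions do not touch
    if x_lo == x_hi or y_lo == y_hi:
        # The regions touch along an edge (or at a single corner point).
        return ((x_lo + x_hi) / 2, (y_lo + y_hi) / 2)
    return None  # overlap with positive area: not handled in this sketch
```

Because the returned point lies on the border of both regions, a vehicle can move in a straight line from a node in the first region to this point and then from this point to a node in the second region.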

Additionally, to take the pictures of monitored objects by cameras equipped on the autonomous vehicles, all nodes and directions with respect to the objects must be recorded. In this study, a learning technique is proposed to guide vision-based vehicles to capture pictures at suitable spots and directions. Two top-view omni-cameras are used. The process is described in detail in Section 3.3.2.

3.2 Calibration of Top-View Omni-Cameras with Fish-Eye Lenses

Each camera used in this study is equipped with a fish-eye lens, so all images captured by the camera are warped. The traditional camera calibration method of obtaining a global-space point via a projection-based transformation therefore cannot be utilized directly; the cameras must be calibrated by another method, as mentioned previously. For this, we propose a point-correspondence technique integrated with an image interpolation method. Note that with this technique the correct coordinates in the global space can be obtained directly from a warped image, as done in this study.

3.2.1 Review of Conventional Camera Calibration Technique

In general, a projection matrix is utilized in conventional methods to do the job of camera calibration. There are two kinds of parameters in the matrix, which must be calibrated, namely, the intrinsic and the extrinsic parameters. The intrinsic parameters do not depend on the position and orientation of a camera in the global space and include the focal length f, the image center point (u0, v0), the aspect ratio (Sx, Sy), and the skew error θs of the camera. Because the coordinate system of a camera and the global space may not be the same, the extrinsic parameters related to the rotation angle θ and the translation vector (t_x, t_y, t_z) of the camera must be calibrated.

Based on the intrinsic and extrinsic parameters, the relation between points in 2-D image and 3-D global spaces may be described by Eq. (3.1) below [23], where the point (u, v)^T is in the image coordinate system and the point (x, y, z)^T is in the global coordinate system:

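$$
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} \simeq
\underbrace{M_{\mathrm{int}}}_{\substack{\text{transformation matrix of} \\ \text{intrinsic parameters}}}
\;
\underbrace{M_{\mathrm{ext}}}_{\substack{\text{transformation matrix of} \\ \text{extrinsic parameters}}}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\qquad (3.1)
$$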

3.2.2 Proposed Calibration Technique

In the proposed calibration technique, we divide the patrolling area on the floor of the environment into multiple grids at first and all corners of them are called reference points. These points are the crossing points of the boundaries of the rectangular-shaped tiles on the floor. For every reference point, both of its coordinates in an image and in the global space must be recorded, describing a pair of corresponding points between the image and the global spaces.

In order to acquire more corresponding points faster, we calculate all quadratic curves in the image of the patrolling floor area, the intersections of which are exactly the desired reference points. Note that because the images are captured by cameras equipped with fish-eye lenses, a straight line in the global space appears as a quadratic curve in the image. Therefore, the technique is feasible. At least three points are needed to calculate a quadratic curve. This property is utilized to find all the curves.

More specifically, we use a minimum mean-square-error (MMSE) method to calculate all the curves. Assume that a curve L with three parameters a, b, and c is to be computed. If points (x1, y1), (x2, y2), ..., (xn, yn) belong to the curve L, we acquire n equations, which can be written in the matrix form

$$A\vec{w} = \vec{b},$$

where $\vec{w} = (a, b, c)^T$ holds the curve parameters, and $A$ and $\vec{b}$ are built from the coordinates of the n points.

Through a series of simplifications from Eq. (3.3), we can acquire the result

$$(A^T A)\,\hat{w} = A^T \vec{b}.$$

Therefore, the curve L may be computed to be

$$\hat{w} = (A^T A)^{-1} A^T \vec{b}.$$

Before calculating a curve, a user has to input the index of the curve and indicate whether the curve is horizontal or vertical. Once a curve is obtained, the index of the curve is recorded at every pixel that the curve passes through. The process of acquiring a quadratic curve is described as Algorithm 3.1 below. An example is shown in Figure 3.1, which is the result of acquiring some horizontal and vertical quadratic curves by the algorithm, with each curve calculated from four points on it.

Algorithm 3.1 Calculating a quadratic curve.

Input: An image I, the number n of points needed to calculate a curve, the index k of the curve, and being horizontal or vertical for the curve.

Output: Pixels passed by the quadratic curve.

Steps:

Step 1. Point out n points on the curve in the image I.

Step 2. Calculate the curve by the MMSE criterion as described previously.

Step 3. Record index k and the property of being horizontal or vertical into the table of pixels passed by the curve.
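Step 2 of the algorithm can be illustrated with a small least-squares fit that solves the normal equations for three parameters. The sketch assumes that a horizontal curve is modeled as y = a·x² + b·x + c (a vertical curve would swap the roles of x and y); this parameterization and the function names are assumptions, since the exact curve equation is not given above.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 linear system m * w = rhs by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points do not determine a unique quadratic curve")
    solution = []
    for k in range(3):
        # Replace column k of m with rhs.
        mk = [[rhs[i] if j == k else m[i][j] for j in range(3)]
              for i in range(3)]
        solution.append(det3(mk) / d)
    return tuple(solution)


def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    if len(points) < 3:
        raise ValueError("at least three points are required")
    # Accumulate the normal equations (A^T A) w = A^T b, where each row of A
    # is [x**2, x, 1] and b holds the y values.
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for x, y in points:
        row = (x * x, x, 1.0)
        for i in range(3):
            atb[i] += row[i] * y
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve3(ata, atb)
```

With exactly three points the fitted curve passes through all of them; with more points, such as the four points per curve used in the example of Figure 3.1, the result is the curve that minimizes the squared error.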

Figure 3.1 Calculating quadratic curves.

When all curves are obtained, we check all pixels in the image. If a pixel is passed both by a vertical curve and by a horizontal one, the pixel is taken to be one of the reference points. The width and height of a grid in the global space are also taken
