
Chapter 1 Introduction

1.1 Motivation of Study

As technology progresses, more and more robots emerge in many applications. The autonomous vehicle is an important and common form of robot. It can move and turn under the control of programs, and the cameras equipped on it can take images to extend its capabilities. It is convenient to use autonomous vehicles to substitute for human beings in many automation applications. For example, a vehicle may patrol an environment for a long time without a rest. In addition, the video taken by a camera can be archived for later searches of its contents of various interests. For example, if a certain object in a video-monitored house is stolen, the video can be watched to possibly find out the thief, providing better evidence of the crime than just the memory of a person.

Using vehicles to patrol indoor environments automatically is convenient and saves manpower. The images taken by the cameras on the vehicles may be transmitted by wireless networks to a central surveillance center, so that a guard there can monitor the environment without going to the spots of events; this also keeps the guard safe from conflicts with invaders. Additionally, it is useful for a vehicle to follow a suspicious person who breaks into an indoor environment under automatic video surveillance, so that clearer images of the person's behavior can be taken by the cameras on the vehicle. It is desired to investigate the problems raised in achieving the previously-mentioned goals and to offer solutions to them.

Possible problems include:

1. constructing the mapping tables of cameras automatically, so that the positions of vehicles, invading persons, concerned objects, etc. can be computed;

2. detection of concerned objects and humans from images acquired by cameras equipped on vehicles and/or affixed to walls or ceilings;

3. recording of invaders' trajectories and computation of their walking speeds for later inspection for security purposes.

We try in this study to solve all these problems for indoor environments with complicated object arrangements in space. If the environment under surveillance is very large, however, the entire environment cannot be monitored by just one omni-camera. So it is desired to use several cameras affixed to the ceiling simultaneously to cover large environment areas. To achieve this goal, possible problems to be solved include:

1. calculating the relative positions of the omni-cameras whose fields of view (FOV’s) overlap;

2. calculating the rotation angles of the omni-cameras;

3. handling the hand-off problem among multiple cameras.

By calculating the relative positions and rotation angles of the cameras, we can compute the position of an object appearing in the images taken by the cameras. And when a person walks from a region covered by one camera to a region covered by another, the system should know which camera should be used to get the image of the person and where in that image the person should be localized. This is the camera hand-off problem which we desire to deal with in this study.
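To make the first two problems concrete, the following minimal Python sketch (with hypothetical variable names, not code from this study) converts a ground-plane point localized in one camera's coordinate frame into a neighboring camera's frame, given the relative position and rotation angle between the two cameras:

```python
import math

def to_neighbor_frame(point_a, offset_ab, theta_ab):
    """Convert a ground-plane point expressed in camera A's frame
    into camera B's frame, where offset_ab is B's origin expressed
    in A's frame and theta_ab is the rotation angle between the
    axes of the two frames (radians)."""
    dx = point_a[0] - offset_ab[0]
    dy = point_a[1] - offset_ab[1]
    c, s = math.cos(theta_ab), math.sin(theta_ab)
    # Rotate the translated point by -theta_ab into B's axes.
    return (c * dx + s * dy, -s * dx + c * dy)
```

Once every point can be expressed in a common frame this way, deciding which camera should take over a walking person reduces to checking which camera's field of view contains the converted point.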

Because the images of omni-cameras are highly distorted and the cameras are affixed to the ceiling, they cannot monitor the whole environment clearly. On the other hand, autonomous vehicles are mobile and are suitable to remedy this shortcoming of the cameras. Hence, we can drive the vehicles to the places that should be monitored to take clearer images of concerned objects or humans there as stronger evidence for crime investigations.

Another problem is that the FOV's of omni-cameras are finite, and the cameras are expensive. If the indoor environment is very large, many omni-cameras will have to be installed on the ceiling, as mentioned previously. But if the cameras on the vehicles can be utilized to take images of the places which are out of the FOV's of the ceiling cameras, fewer omni-cameras are needed.

Hence if we want to navigate a vehicle to some spot in the environment, we should calculate the position of the vehicle first, and then plan a path from that position to the spot. Most environments contain many obstacles, such as furniture and walls. To avoid collisions between the vehicles and the obstacles, we may gather the information of the environment first, including the positions of still obstacles and the open spaces where the vehicles can drive through. Afterward, we may integrate this information to construct an environment map. In short, the environment map is used for obstacle avoidance and path planning in this study. If we want to drive a vehicle to a certain spot out of the FOV's of the cameras, we should calculate the position and direction of the vehicle at any time and plan a path for the vehicle to reach that spot. Possible problems in these applications include:

1. gathering environmental information and constructing an environment map;

2. calculating the position and direction of the vehicle at any time;

3. path planning and avoidance of still and dynamic obstacles in the path for the vehicle to navigate to its destination.
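The map-building step above can be sketched as a simple occupancy grid in Python (an illustrative representation with assumed cell size, not the exact data structure used in this study): obstacle positions are marked on a coarse grid, and the path planner queries whether a given point falls in a blocked cell.

```python
def build_grid_map(obstacle_points, width, height, cell=10):
    """Build a boolean occupancy grid (True = blocked) from obstacle
    positions in cm; width and height are environment dimensions."""
    rows, cols = height // cell, width // cell
    grid = [[False] * cols for _ in range(rows)]
    for x, y in obstacle_points:
        grid[y // cell][x // cell] = True
    return grid

def is_blocked(grid, x, y, cell=10):
    """Check whether the map cell containing point (x, y) is blocked."""
    return grid[y // cell][x // cell]
```

A path can then be validated by sampling points along each segment and rejecting segments that pass through a blocked cell.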

As a summary, in this study it is desired to investigate solutions to various problems involved in the following topics of indoor autonomous vehicle navigation:

1. security patrolling in indoor environments by autonomous vehicles;

2. effective integration of the omni-cameras on the ceiling and the cameras on the vehicles;

3. following a suspicious person and taking clearer images of her/him by the cameras on vehicles;

4. using the cameras on vehicles to monitor spots which are out of the FOV’s of omni-cameras and take clearer images.

1.2 Survey on Related Studies

In this study, we will use multiple omni-cameras on the ceiling to locate the position of a vehicle, so the omni-cameras should be calibrated before being used.

Traditionally, the intrinsic and extrinsic parameters of a camera are calculated to obtain a projection matrix for transforming points between the 2-D image space and the 3-D global space [1, 2, 3]. Besides, a point-correspondence technique integrated with an image interpolation method has been proposed in recent years for object location estimation [4], but it causes another problem: the calibration data change according to the environment where the cameras are used. In this study, we will propose a technique to solve this problem for the case of changing the height of the ceiling on which the cameras are affixed.
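The point-correspondence idea can be sketched as follows: given four known correspondences between image points and global points at the corners of one grid cell of a calibration pattern, the global position of any image point inside the cell is estimated by bilinear interpolation. The code below is an illustrative reconstruction with an assumed data layout, not the implementation of [4]:

```python
def interpolate_position(p, cell):
    """Estimate the global coordinates of image point p by bilinear
    interpolation over one grid cell of the mapping table.
    `cell` holds two image corners (u0, v0), (u1, v1) and the four
    matching global points g00, g10, g01, g11 (hypothetical layout)."""
    (u0, v0), (u1, v1), g00, g10, g01, g11 = cell
    s = (p[0] - u0) / (u1 - u0)   # horizontal weight in [0, 1]
    t = (p[1] - v0) / (v1 - v0)   # vertical weight in [0, 1]
    gx = ((1 - s) * (1 - t) * g00[0] + s * (1 - t) * g10[0]
          + (1 - s) * t * g01[0] + s * t * g11[0])
    gy = ((1 - s) * (1 - t) * g00[1] + s * (1 - t) * g10[1]
          + (1 - s) * t * g01[1] + s * t * g11[1])
    return (gx, gy)
```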

Autonomous vehicles in general suffer from mechanical errors, and many methods have been proposed to eliminate this kind of error. The geometric shapes of object boundaries [5, 6] or those labeled by users are utilized frequently [7, 8].

Furthermore, natural landmarks, such as house corners [9, 10] and the SIFT features of images [11], are also used to correct the position of a vehicle. In recent years, techniques of integrating laser range finders with conventional imaging devices have been proposed [12, 13]. Besides, when it is desired to find a specific object in the image, the method of color histogramming is often used [14].

The applications of autonomous vehicles emerge in many aspects, such as house cleaning robots, watchdog systems, automatic guides, etc. In Takeshita [15], a camera was equipped on the ceiling, and a user could control the vehicle to suck up garbage on the ground by watching the images taken by the camera. In Yang [16], the vehicles were also designed to patrol an environment; he used the vectors of the vehicles and obstacles to avoid collisions between them.

1.3 Overview of Proposed System

There are four main goals in this system. First, a vehicle should patrol automatically in an indoor environment whose information has been learned. Second, the vehicle should avoid static and dynamic obstacles, and third, it should correct its position automatically. Finally, the vehicle should follow an intruding person and take images of the person.

In order to achieve these goals, the following steps should be done:

1. construct mapping tables for top-view cameras;

2. acquire environment information by top-view cameras;

3. detect obstacles on the path;

4. correct mechanical errors by top-view omni-cameras;

5. calculate the position of any intruder by top-view omni-cameras continually;

6. initiate the vehicle to follow the intruder and take images of him/her;

7. deal with the hand-off problem of the cameras.

Because we need to convert coordinates between the image coordinate system and the global coordinate system, we have to construct the mapping tables for the cameras we use first. Afterward, coordinates can be transformed correctly among the multiple cameras. Because the vehicles patrol the indoor environment, the environment information should be learned in advance. The information learned in this study includes the positions of obstacles and the open spaces in the environment where the vehicles can drive through, and it is used to build an environment map.

When the vehicles patrol the environment whose information has been learned, the patrolling path can be checked to see if there are obstacles on it. If so, the vehicles should avoid them automatically. Besides, the vehicles generally suffer from mechanical errors, so their positions and directions need to be corrected continuously to avoid intolerable deviations from their correct paths.

When a person breaks into the environment, the position of the person is calculated continually, and then the computer gives orders to guide a vehicle to follow the person. In order to expand the range of surveillance, several omni-cameras are used in this study, so the hand-off problem should be handled. Briefly, the problem means the need of identifying a person in an image acquired by one camera and passing the information to an image taken by another camera. Figure 1.1 shows a flowchart of the proposed system.

Figure 1.1 The flowchart of the proposed system.

1.4 Contributions

Several contributions are made in this study, as described in the following:

1. A height adaptation method is proposed to construct the mapping tables for omni-cameras in order to make the cameras usable at different ceiling heights.

2. An integrated space mapping method is proposed to localize objects in real space using multiple fisheye cameras.

3. A method is proposed to solve the problem of camera hand-off in highly distorted images taken by fisheye cameras.

4. A method is proposed to gather distorted environment images taken by omni-cameras, and convert them into a flat map.

5. A method is proposed to dynamically correct the position and direction errors of a vehicle caused by its mechanical imperfections.

6. A method is proposed to avoid static and dynamic obstacles automatically in real time on vehicle navigation paths.

7. A technique is proposed to calculate the position of a person according to the rotational invariance property of omni-cameras.

8. A method is proposed to predict the position of a person in a highly distorted image.

1.5 Thesis Organization

The remainder of this thesis is organized as follows. In Chapter 2, the hardware and processes of this system are introduced. In Chapter 3, the proposed method for constructing the mapping tables for fisheye cameras is described.

In Chapter 4, the construction steps of an environmental map and the proposed method for obstacle avoidance are described, and four strategies of correcting the positions and directions of vehicles are also described. In Chapter 5, the method of finding specific partial regions in an image to compute the position of a person and the technique for prediction of the person’s next movement are described. The hand-off problem is solved in this chapter, too.

The experimental results of the study are shown in Chapter 6, and some discussions are also included. Finally, conclusions and some suggestions for future work are given in Chapter 7.

Chapter 2

System Configuration

2.1 Introduction

The hardware and software used in this study are introduced in this chapter. The hardware includes the autonomous vehicle we use, the fisheye cameras, and the wireless network equipment. The software includes the programs for gathering the information of an environment, constructing an environment map, avoiding obstacles while patrolling an environment, and calculating the position of a person automatically.

2.2 Hardware

The autonomous vehicle we use in this study is a Pioneer 3-DX vehicle made by MobileRobots Inc., and an Axis 207MW camera made by AXIS was equipped on the vehicle as shown in Figure 2.1. The Axis 207MW camera is shown in Figure 2.2.

The Pioneer 3-DX vehicle has a 44 cm × 38 cm × 22 cm aluminum body with two 19 cm wheels and a caster. It can reach a speed of 1.6 meters per second on flat floors, and climb grades of 25° and sills of 2.5 cm. At slower speeds it can carry payloads up to 23 kg, including additional batteries and all accessories. It is powered by three 12 V batteries, which are fully charged initially. A control system embedded in the vehicle allows the user's commands to control the vehicle to move forward or backward, or to turn around.

The system can also return some status parameters of the vehicle to the user.

The Axis 207MW camera has dimensions of 85 × 55 × 40 mm (3.3″ × 2.2″ × 1.6″), not including the antenna, and a weight of 190 g (0.42 lb), not including the power supply, as shown in Figure 2.2. The maximum image resolution is up to 1280 × 1024 pixels. In our experiments, a resolution of 320 × 240 pixels is used for the camera fixed on the vehicle, and one of 640 × 480 pixels is used for the camera affixed on the ceiling. Both frame rates are up to 15 fps. Via wireless networks (IEEE 802.11b and 802.11g), captured images can be transmitted to users at speeds up to 54 Mbit/s. Each camera used in this study is equipped with a fish-eye lens, whose field of view is wider than that of a traditional lens.

(a) (b)

Figure 2.1 The vehicle used in this study is equipped with a camera. (a) A perspective view of the vehicle. (b) A front view of the vehicle.

(a) (b)

Figure 2.2 The camera system used in this study. (a) A perspective view of the camera.

(b) A front view of the camera.

The Axis 207MW cameras are fisheye cameras. They are also affixed on the ceiling and utilized as omni-cameras, as shown in Figure 2.3. A notebook is used as a central computer to control the processes and calculate needed parameters from the information gathered by the cameras.

Figure 2.3 An Axis 207MW camera is affixed on the ceiling.

Figure 2.4 A notebook is used as the central computer.

Communication between the hardware components mentioned above is via a wireless network; a WiBox made by Lantronix is equipped on the vehicle to deliver and receive the signals of the odometer, as shown in Figure 2.5.

(a) (b)

Figure 2.5 The wireless network equipments. (a) A wireless access point. (b) A WiBox made by Lantronix.

2.3 System Process

In the proposed process of constructing the mapping tables for omni-cameras, we calculate the relative positions and rotation angles between the cameras. Afterward, mapping tables are constructed automatically for every camera, and a point-correspondence technique integrated with an image interpolation method is used to calculate the position of any object appearing in the image.

In the proposed process of environment learning, the information is gathered by a region growing technique first. The information includes the positions of obstacles and the open spaces where a vehicle can drive through. Afterward, the positions are converted into the global coordinate system, and an environment map is constructed by composing the coordinates of all obstacles appearing in the vehicle navigation environment.

In the proposed process of security patrolling in an indoor environment, each vehicle is designed to avoid obstacles on the navigation path. If there are obstacles on a vehicle's path, the system plans several turning points to form a new path for the vehicle to navigate safely. After the vehicle patrols for a while, it will deviate from its path because it suffers from mechanical errors.

In the proposed process of vehicle path correction, we calculate the position of a vehicle in the image coordinate system by the images taken by the omni-cameras affixed on the ceiling, and then convert the coordinates into global coordinates and modify accordingly the value of the odometer in the vehicle.
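A minimal sketch of this correction step, assuming poses of the form (x, y, heading) in cm and degrees (the names and the tolerance value are illustrative, not the thesis's actual parameters): the odometer pose is replaced by the camera-derived pose whenever the observed drift exceeds a tolerance.

```python
def correct_pose(odo_pose, cam_pose, tolerance=5.0):
    """Replace the odometer pose with the camera-derived pose when
    the positional drift exceeds `tolerance` (cm); otherwise keep
    the odometer value. Poses are (x, y, heading_deg) tuples."""
    dx = cam_pose[0] - odo_pose[0]
    dy = cam_pose[1] - odo_pose[1]
    drift = (dx * dx + dy * dy) ** 0.5
    return cam_pose if drift > tolerance else odo_pose
```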

In the proposed process of person following, the position of an intruding person's feet is also calculated from the images taken by the omni-cameras on the ceiling, and then the system calculates the relative position and rotation angle between the vehicle and the person, and adjusts the orientation and speed of the vehicle accordingly to achieve the goal of following the person.

The major processes of the system are summarized and listed below:

1. Construct mapping tables for every top-view camera.

2. Acquire environmental information by top-view cameras and construct the environment map.

3. Correct mechanical errors continuously in each cycle.

4. Plan a path to avoid obstacles in the environment.

5. Detect, predict, and compute the position of any intruding person by top-view omni-cameras continuously.

6. Handle the camera hand-off problem to keep tracking any intruding person using a single camera at a time.
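Step 6 can be sketched as a simple selection rule (an illustrative policy with hypothetical names, not the exact hand-off logic of this study): among the cameras whose FOV covers the person, the camera whose optical center is nearest is chosen, so that exactly one camera tracks the person at a time.

```python
import math

def select_camera(person_pos, cameras):
    """Pick the id of the camera that should track the person.
    `cameras` maps a camera id to (center_xy, fov_radius), both in
    global ground-plane coordinates. Returns None if the person is
    outside every camera's FOV."""
    best, best_d = None, float("inf")
    for cam_id, (center, radius) in cameras.items():
        d = math.hypot(person_pos[0] - center[0],
                       person_pos[1] - center[1])
        if d <= radius and d < best_d:
            best, best_d = cam_id, d
    return best
```

Preferring the nearest camera center also tends to pick the least distorted view of the person, since fisheye distortion grows toward the image border.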

Chapter 3

Adaptive Space Mapping Method for Object Location Estimation Subject to Camera Height Changes

3.1 Ideas of Proposed Adaptive Space Mapping Method

In this study, we use multiple fish-eye cameras affixed on the ceiling to keep an indoor environment under surveillance. The cameras are utilized to locate and monitor the autonomous vehicle, and to track any suspicious person who comes into the environment. When using these omni-cameras, we need to know the conversion between the ICS (image coordinate system) and the GCS (global coordinate system). So we propose a space mapping method and construct a mapping table for converting coordinates between the two coordinate systems.

Because the indoor environment under surveillance is unknown at first, we propose further in this study another space mapping method by which the cameras can be affixed to ceilings of different heights for use, which we call the height-adaptive space mapping method. Besides, multiple fish-eye cameras are used at the same time to monitor an environment in this study, so calculating the relative positions and angles between every two cameras which have overlapping fields of view (FOV's) is needed and is done in this study. Finally, a point-correspondence technique integrated with an image interpolation method is used to convert coordinates between the ICS and the GCS.

3.2 Construction of Mapping Table

In this section, we propose a method of constructing a basic mapping table for use at a certain ceiling height. The mapping table contains 15 × 15 pairs of points, each pair including an image point and its corresponding space point. The data of each pair in the table include the coordinates (x1, y1) of the image point in the ICS and the coordinates (x2, y2) of the corresponding space point in the GCS. We use a calibration board which contains 15 vertical lines and 15 horizontal lines to help us construct the mapping table.
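The table described above can be sketched as the following data structure (an illustrative layout, assuming the 15 × 15 grid intersections have already been detected in the image and their board positions are known):

```python
def build_mapping_table(image_points, world_points):
    """Pair the 15 x 15 grid intersections detected in the image with
    their known positions on the calibration board, forming a dict
    from (row, col) to the pair ((x1, y1), (x2, y2)): image
    coordinates in the ICS and space coordinates in the GCS."""
    table = {}
    for r in range(15):
        for c in range(15):
            table[(r, c)] = (image_points[r][c], world_points[r][c])
    return table
```

Any image point falling between grid intersections can then be localized by interpolating among the neighboring pairs in this table.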

First, we take an image of the calibration board, find the curves in the image by curve fitting, and calculate the intersection points of the lines in the image. Then, we
