Principle of Escape - Indoor Security Patrolling and Intruder Detection and Tracking by Autonomous Vehicle Navigation Using Computer Vision

In this section, we describe the principle of escape for the vehicle. Before describing the escape method, we have to decide in which situations the vehicle should escape. We define three states for the vehicle: a safe state, an unsafe state, and a buffer state. When the vehicle is in an unsafe state, we command it to escape. When it is in a safe state, we command it to move toward the target person. When it is in a buffer state, we command it to keep conducting the human detection and tracking process without moving forward or backward.

To define these three states, we introduce the concept of safe-distance: if someone is too close to the vehicle, we infer that he or she might be going to attack it. Imagine a circle on the xy-plane in the VCS with the vehicle as the center and a pre-defined distance r, the safe-distance, as the radius. If a stranger is within the circle, that is, if the distance between the vehicle and the person is less than r, we say the vehicle is in an unsafe state. Outside the circle, we place a buffer area between the unsafe and safe regions; while a person stays in the buffer area, the vehicle moves neither forward nor backward. To keep the vehicle in a safe state, as soon as we detect that a stranger has transgressed the safe-distance, we command the vehicle to escape. The process is shown in Figure 6.1.
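The three-state rule can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in this study: the text does not give the outer radius of the buffer area, so the `buffer_width_cm` parameter, the function name `classify_state`, and the action strings are assumptions introduced here.

```python
from enum import Enum


class VehicleState(Enum):
    UNSAFE = "unsafe"
    BUFFER = "buffer"
    SAFE = "safe"


def classify_state(distance_cm, safe_distance_cm, buffer_width_cm):
    """Classify the vehicle state from the distance to the detected person.

    buffer_width_cm is a hypothetical parameter: the width of the buffer
    ring that starts at the safe-distance circle.
    """
    if distance_cm < safe_distance_cm:
        return VehicleState.UNSAFE      # person is inside the safe-distance circle
    if distance_cm < safe_distance_cm + buffer_width_cm:
        return VehicleState.BUFFER      # person is inside the buffer ring
    return VehicleState.SAFE            # person is beyond the buffer ring


# Command issued in each state, following the description in the text.
ACTIONS = {
    VehicleState.UNSAFE: "escape",
    VehicleState.BUFFER: "keep detecting and tracking without moving",
    VehicleState.SAFE: "move toward the target person",
}
```

For example, with a safe-distance of 100 cm and an assumed buffer width of 50 cm, a person at 120 cm puts the vehicle in the buffer state, so it keeps tracking without moving.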

To compute the distance between the detected stranger and the vehicle, we conduct the face detection process described in the previous chapter. We assume the height of a human face to be between 20 cm and 25 cm and the height of a human body to be between 50 cm and 60 cm. With these assumed heights, we can compute the distance between the person and the vehicle by the angular mapping mentioned in Chapter 3, and then compare it with the pre-defined safe-distance to determine whether the detected person transgresses the safe-distance.

Figure 6.1 The process of escape for the vehicle.

If the vehicle is in an unsafe state, we have to command the vehicle to escape. However, only one camera is equipped on the vehicle, and the camera has to keep observing the stranger, so it cannot also observe the environment to plan an escape. Thus, we record the path along which the vehicle has moved as a solution to this problem. When the vehicle has to escape, it moves backward along the recorded path. During an escape in an unsafe state, the camera observes the stranger continually. When the target person is beyond the safe-distance again, the vehicle resumes tracking the person.
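A minimal sketch of escaping by retracing the recorded path is given below in Python. The text does not specify how the path is stored, so the representation of each movement as a turn followed by a straight drive, and the names `Move`, `PathRecorder`, and `escape_step`, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    turn_deg: float     # turn made at the start of the navigation cycle
    forward_cm: float   # distance driven after the turn


class PathRecorder:
    """Keeps the movements of past navigation cycles, most recent last."""

    def __init__(self):
        self._moves = []

    def record(self, move):
        self._moves.append(move)

    def escape_step(self):
        """Return the commands that undo the most recent movement.

        The last movement is removed from the record. None is returned
        when there is no recorded movement left to retrace.
        """
        if not self._moves:
            return None
        last = self._moves.pop()
        # Undo in reverse order: drive backward first, then turn back.
        return [("drive", -last.forward_cm), ("turn", -last.turn_deg)]
```

Calling `escape_step` once per cycle while the vehicle remains in an unsafe state walks the vehicle back along its own path one recorded movement at a time.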

Figure 6.2 The safe-distance for the vehicle. The vehicle is in an unsafe state when a person is detected within the green circle. If the person is in the yellow area, the vehicle is in a buffer state. Otherwise, it is in a safe state.

6.3 Distance Computation from the Vehicle to a Stranger

We use the face region or the clothes region of the person to compute the distance between the person and the vehicle and to determine whether the state of the vehicle is safe or unsafe. However, the image gives us only the angular information of the face; to compute the distance, we also need the height of the face. In addition, we do not need the exact distance between the person and the vehicle to decide the state of the vehicle; we only need to know whether the distance is smaller or larger than the safe-distance defined in advance. To compute the distance of the person to the vehicle, we first make a few assumptions: the person is standing on the ground, the length of his/her face is between 20 cm and 25 cm, and the length of his/her body is around 50 cm to 60 cm.

The following algorithms give the details of computing the distance between the person and the vehicle.

Figure 6.3 Illustration of the distance between the person and the vehicle. (a) Distance computation using the length of the face. (b) Distance computation using the length of the clothes.

Algorithm 6.1 Calculation of the distance from the vehicle to the person by the face region.

Input: The detected face region Rface in an image, where Rface has two pairs of coordinates in the ICS: (fleft, ftop) and (fright, fbottom), which represent the bounding box of Rface. The range of the length of a human face [C1, C2].

Output: The range [D0, D1] of the distance between the person and the vehicle.

Steps:

Step 1. Transform the coordinates [expressions missing in the source].

Step 2. Referring to Figure 6.3(a) and assuming the distance of the person to be d, compute the value of h1 − h2 [equation and subsequent steps missing in the source].

Algorithm 6.2 Calculation of the distance from the vehicle to the person by the clothes region.

Input: The detected clothes region Rclothes in an image, where Rclothes has two pairs of coordinates in the ICS: (fleft, ftop) and (fright, fbottom), which represent the bounding box of Rclothes. The range of the length of a human body [C1, C2].

Output: The range [D0, D1] of the distance between the person and the vehicle.

Steps:

Step 1. Transform the coordinates [expressions missing in the source].

Step 2. Referring to Figure 6.3(b) and assuming the distance between the person and the vehicle to be d, compute the value of h1 − h2 [equation and subsequent steps missing in the source].
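Both algorithms bound the distance by combining the angular extent of a detected region with an assumed range [C1, C2] of its physical length. The Python sketch below illustrates one way such a bound could be computed; it is not a reconstruction of the algorithms' equations. It assumes that the angular mapping yields the elevation angles of the top and bottom edges of the region and that the person stands upright at horizontal distance d, so that h1 − h2 = d (tan φ_top − tan φ_bottom). The function names, the parameters, and the conservative decision rule are hypothetical.

```python
import math


def distance_range(phi_top_deg, phi_bottom_deg, c1_cm, c2_cm):
    """Return (D0, D1), the nearest and farthest horizontal distances in cm
    consistent with a region whose true length lies in [c1_cm, c2_cm].

    Assumption (not from the source): phi_top_deg and phi_bottom_deg are the
    elevation angles, in degrees, of the top and bottom edges of the region.
    """
    spread = (math.tan(math.radians(phi_top_deg))
              - math.tan(math.radians(phi_bottom_deg)))
    if spread <= 0.0:
        raise ValueError("the top edge must have a larger elevation angle "
                         "than the bottom edge")
    # h1 - h2 = d * spread, and h1 - h2 is assumed to lie in [c1_cm, c2_cm].
    return c1_cm / spread, c2_cm / spread


def may_transgress(d0_cm, safe_distance_cm):
    """Conservative rule (an assumption): treat the person as too close as
    soon as the nearest possible distance D0 is inside the safe-distance."""
    return d0_cm < safe_distance_cm
```

For instance, a face whose edges are seen at +5 and −5 degrees with an assumed length of 20 cm to 25 cm gives a distance range of roughly 114 cm to 143 cm. Only the comparison with the safe-distance matters, which is why a range is sufficient and the exact distance is not needed.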

6.4 Method for Adjustment of Camera Orientation for Human Monitoring

After deciding the state of the vehicle, if the vehicle is in an unsafe state, we command the vehicle to escape. After an escape command is given, the vehicle moves backward to its position in the last navigation cycle. Since the camera is carried by the robot arm on the vehicle, the camera might no longer face the stranger after the backward movement. To keep monitoring the stranger continuously, we have to reset the orientation of the camera cycle after cycle. From the records of the last movement of the vehicle, we obtain the turning angle θ. Moving backward means that the vehicle has to turn through the angle −θ to escape. Thus, as soon as the vehicle reaches the last position, we command the camera to turn through the angle θ to keep the intruding person in view.
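The camera compensation can be illustrated with a short Python sketch. The sign conventions are assumptions: the camera pan angle is taken to be measured relative to the vehicle body, in the same rotational sense as the vehicle's turning angle, and the names `wrap_deg` and `compensate_camera` are hypothetical.

```python
def wrap_deg(angle_deg):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def compensate_camera(pan_deg, last_turn_deg):
    """Return the new camera pan after the vehicle undoes its last turn.

    The vehicle rotates by -last_turn_deg while escaping, so the camera
    turns by +last_turn_deg relative to the vehicle body. The viewing
    direction in the world (vehicle heading + pan) is then unchanged.
    """
    return wrap_deg(pan_deg + last_turn_deg)
```

With a vehicle heading of 90 degrees, a pan of 10 degrees, and a last turn of 30 degrees, the heading becomes 60 degrees after the escape and the pan becomes 40 degrees, so the camera still looks along the 100-degree direction.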

An experimental result is shown in Figure 6.4. At cycle time t = 1, the vehicle detected a face in the grabbed image and computed the distance of the person. The red box indicates that the person transgressed the safe-distance defined in advance. Thus, the vehicle moved backward to the last position at t = 2. The yellow box marks the detected face of the person, who is now farther from the vehicle.

Figure 6.4 An experimental result in which the vehicle moves backward to the last position after detecting that the person is too close, shown at cycle times t = 1 and t = 2. (a) The images grabbed by the camera equipped on the vehicle. (b) The images captured from a third person’s viewpoint.

Chapter 7 Experimental Results and Discussions

7.1 Experimental Results

In this section, we show some experimental results of the proposed human detection and following system. The user interface of the system is shown in Figure 7.1. All experiments of this study were conducted in our laboratory, the Computer Vision Laboratory at the Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan. The proposed system has two modes: a detection mode and a tracking mode. In the detection mode, the system detects humans in acquired images. In the tracking mode, the system tracks a target person by the features of the extracted clothes.

After a user presses the start button, the system starts monitoring the environment in the detection mode. In each cycle, the system first determines the state of the vehicle (safe or unsafe). To do this, the system needs to know whether a person transgresses the safe-distance of the vehicle, and a detected face region in the image is the only feature used for computing the distance between the person and the vehicle. Therefore, the system conducts face detection first in each cycle, in both the detection mode and the tracking mode. If a face is detected and the distance of the person is smaller than the safe-distance, the vehicle is in an unsafe state, and the system commands the vehicle to move backward to the last position and then finishes the current cycle.

When the vehicle is in a safe state and the system is in the detection mode, the system conducts face detection. If a face is detected in the image, the system extracts the clothes region for tracking, changes from the detection mode to the tracking mode, and then finishes the current cycle, as shown in Figure 7.3. If no face is detected, the system conducts human body detection based on motion detection. If a human body is detected, the system commands the vehicle to move toward the person, trying to get an image with a clear face region, and then finishes the current cycle, as shown in Figure 7.2.

Otherwise, when the system is in the tracking mode, which means that it already has the image of the clothes of the target person, the vehicle tracks the target person using the intersection of the clothes images in each cycle. When the system loses the target person, it changes from the tracking mode back to the detection mode. An experimental result is shown in Figure 7.4.
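The per-cycle decision flow described above can be summarized as a small Python sketch. It is an outline under stated assumptions, not the system's code: the detector results are passed in as plain values, the function name `run_cycle` and the action strings are hypothetical, and the "keep_monitoring" action for a cycle in which nothing is detected is an assumption, since the text does not say what the vehicle does in that case.

```python
def run_cycle(mode, face_distance_cm, body_detected, target_visible,
              safe_distance_cm):
    """Decide the action of one cycle and the mode of the next cycle.

    mode             -- "detection" or "tracking"
    face_distance_cm -- distance computed from a detected face, or None
                        when no face was detected in this cycle
    body_detected    -- result of motion-based human body detection
    target_visible   -- whether clothes-based tracking still finds the target
    Returns (next_mode, action).
    """
    # Face detection runs first in every cycle, in both modes.
    if face_distance_cm is not None and face_distance_cm < safe_distance_cm:
        return mode, "move_backward_to_last_position"   # unsafe state

    if mode == "detection":
        if face_distance_cm is not None:
            return "tracking", "extract_clothes_region"
        if body_detected:
            return "detection", "move_toward_person"
        return "detection", "keep_monitoring"           # assumed default

    # Tracking mode: follow the clothes region until the target is lost.
    if target_visible:
        return "tracking", "track_by_clothes_intersection"
    return "detection", "keep_monitoring"               # assumed default
```

For example, `run_cycle("detection", 150.0, False, False, 100.0)` switches to the tracking mode after extracting the clothes region, while the same call with a face distance of 80.0 makes the vehicle move backward and leaves the mode unchanged.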

Figure 7.1 An interface of the experiment. The green box shows the image stream and the blue box shows the input image at this moment. The yellow box shows the difference image and the red box shows the output image.

Figure 7.2 An experimental result of human body detection in the proposed system at cycle times t = 1, 2, and 3. (a) The input image. (b) The difference image. (c) The output image.

Figure 7.3 An experimental result of face detection and the extraction of the cloth. (a) The output image with a detected face region and the cloth region extracted by region growing. (b) The image of the extracted cloth.

Figure 7.4 An experimental result of human tracking using the intersection of the cloth images at cycle times t = 1 to t = 5. (a) The input image. (b) The output image.

7.2 Discussions

Analysis of the experimental results of navigation identified the following problems.

(1) The result of detecting the moving region by blockwise frame differencing may become worse in a cluttered environment. When a stationary object is too close to the vehicle, its relative movement in the image is amplified. If we slow down the vehicle, an erroneous judgment will sometimes occur.

(2) The color-based face detection is sensitive to the luminance of the light in the environment. We can adjust the hue and the saturation of the camera manually in advance, but changes in luminance cannot be predicted in advance. Although lighting in an indoor environment is more stable than lighting outdoors, an image can still be affected easily by the diaphragm of the camera.

(3) The human tracking process by clothes color cannot provide a measure of the distance between the vehicle and the target person; it can only compute the angular position of the target person. The system computes the distance of a person from the detected face region. In other words, the system can calculate the distance only when a person’s face has been detected.

Chapter 8 Conclusions and Suggestions for Future Works

8.1 Conclusions

Several techniques and strategies have been proposed in this study and integrated into an autonomous vehicle system for security patrolling in indoor environments with human detection and following capabilities.

First, a camera calibration method by angular mapping is proposed. We calibrate the camera by a technique of angular mapping, which uses the concept of the spherical coordinate system. Each point in the image is the projection of a light ray onto the image sensor, and the light ray can be described by its longitude angle and latitude angle in the 3D world space. The angular mapping calibration technique, which uses image analysis, is used to compute the direction from the vehicle to a target. From these angles and the height of the camera, we can determine the relative locations of targets in images.

Next, some human detection techniques are proposed for indoor environments, including face detection and body detection. A human face is detected by the use of color and shape features in images. We use an elliptic skin model in the YCbCr color space to identify skin color regions and adopt an ellipse shape to fit the face contour. In addition, we propose a blockwise frame differencing method to extract moving objects in the image and decide whether a moving object is similar to a human body.

Then, human tracking techniques are proposed. After an intruding person is detected, the system remembers his/her clothes and tracks him/her. We propose a clothes region intersection method to predict the motion of a person in order to track him/her. Also, we record all the motions of the target person and accordingly compute a parameter for predicting the motion of the target person.

In addition, a vehicle escape method by safe-distance keeping is proposed. We designed a function for the vehicle to escape from offensive strangers by a technique of safe-distance keeping. From the coordinates of the detected face region in an image, we compute the distance between the person and the vehicle. If the distance is smaller than a pre-defined safe-distance, the vehicle is commanded to escape by moving backward to the last position.

The experimental results shown in the previous chapter have revealed the feasibility of the proposed system.

8.2 Suggestions for Future Works

The proposed strategies and methods, as mentioned previously, have been implemented on a vehicle system with a robot arm. Several suggestions and related issues are worth further investigation in the future. We state them as follows.

(1) In this study, we proposed a skin color model assuming an environment with uniform lighting. Thus, adding a capability of skin color learning to adapt the system to the changes of environment lighting is a suggestion for future work.

(2) We use clothes colors as the only feature of the clothes. To improve the extraction of clothes, we suggest conducting clothes tracking by different features, such as texture and shape, to eliminate errors caused by the case where the clothes color is

(3) In this study, we only proposed techniques of human detection and tracking. Adding a face recognition capability to recognize specified persons would allow the vehicle to react differently to different people.

(4) Since the vehicle has only one camera, its escape paths can only follow its previous paths. We suggest adding an omni-directional camera so that better escape paths can be planned.

