In human tracking, the vehicle follows the target person by repeatedly detecting the person’s clothes. In the previous chapter, we described how to estimate the location of the target person’s face in the image. The system then extracts the clothes region of the person to facilitate tracking. The idea behind the tracking method is to keep the target person at the center of the image, which means that the head of the vehicle always aims at the target person while the vehicle moves forward. After turning its head, the vehicle moves toward the person by a constant distance. Figure 5.1 shows one cycle of the human tracking process.
Figure 5.1 A cycle of the human tracking process.
The proposed process of human tracking is described in the following algorithm.
Algorithm 5.1 Process of human tracking.
Input: The region of the detected human face Rface. Output: The commands for the vehicle.
Steps:
Step 1. Save the image Icloth of the clothes of the target person by a region growing technique described in the next section.
Step 2. Set the missing counter Cm to 0.
Step 3. Capture an image Icurrent with the camera.
Step 4. Obtain the difference image of Icloth and Icurrent to compute the position of the target person. If the target person is found to be missing, increase Cm by 1; otherwise, reset Cm to 0.
Step 5. Command the vehicle to turn its head to aim at the target person and move forward.
Step 6. Check the missing counter Cm, which represents the number of times that the target person has been missing. Given a threshold t, if Cm is not larger than t, go back to Step 3; otherwise, decide that the target person is lost, command the vehicle to stop tracking, and end the process of human tracking.
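The control flow of Algorithm 5.1 can be illustrated with a minimal Python sketch. The function name `track` and the two callbacks `locate_target` and `steer_and_advance` are hypothetical stand-ins for the image comparison and the vehicle commands; they are not part of the original system. The sketch also assumes that the counter is initialised once before the loop and that the vehicle is steered only in cycles where the target is found.

```python
from typing import Callable, Optional, Tuple

Position = Tuple[int, int]  # (u, v) in image coordinates


def track(
    locate_target: Callable[[], Optional[Position]],
    steer_and_advance: Callable[[Position], None],
    threshold: int,
) -> int:
    """Run tracking cycles until the missing counter exceeds `threshold`.

    Returns the number of cycles that were executed.
    """
    missing = 0  # the missing counter Cm, initialised once (assumption)
    cycles = 0
    while missing <= threshold:
        cycles += 1
        position = locate_target()  # capture an image and compare it with Icloth
        if position is None:
            missing += 1  # the target was judged missing in this cycle
        else:
            missing = 0  # the target was found, so reset the counter
            steer_and_advance(position)  # aim at the target and move forward
    return cycles
```

With a threshold of 2, for example, the loop ends in the cycle where the target has been missing three times in a row.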
5.3 Extraction of Colors of Human Clothes
Since we already know the face region, we can infer that the body region is below the face region. We choose a point which belongs to the body region to be the start point for region growing, and give a square boundary for region growing. By region growing, we can get the image part and the location of the clothes. After cloth extraction, the vehicle can track the human by the clothes region in the image sequence.
According to the detected face region, we know the width and height of the face region in images. To infer the body region of the person from the face region, we reference the drawing of Vitruvian Man by Leonardo da Vinci, as shown in Figure 5.2.
According to Leonardo’s notes in the accompanying text, the drawing was made as a study of the proportions of the human body as described in a treatise by the ancient Roman architect Vitruvius, who wrote in De Architectura 3.1.3: “for measuring from the feet to the crown of the head, and then across the arms fully extended, we find the latter measure equal to the former; so that lines at right angles to each other, enclosing the figure, will form a square.” According to the body proportions defined by Vitruvius, the distance from the hairline to the bottom of the chin is one-tenth of a man’s height, and the maximum width of the shoulders is a quarter of a man’s height. As shown in Figure 5.3, if we know the region of the human face, these proportions let us infer the width of the shoulders and the distance from the face to the body. Since the image captured with the camera is a projection that approximately preserves these ratios, once we detect the face region in the image, we can approximately infer the region of the person’s clothes. We use the center of the clothes region as the start point for region growing of the clothes region, and the red square in the figure is the boundary for region growing. The details are presented in the following algorithm.
Algorithm 5.2 Computing start point and boundary square for region growing of clothes.
Input: The detected face region Rface in the image, where Rface has two pairs of coordinates (u, v) in the ICS: (fleft, ftop) and (fright, fbottom), which represent the boundary box of Rface.
[...]
Step 2. Compute $width_{shoulder} = \frac{5}{2} \times height_{face}$. Let $width_B = width_{shoulder} = \frac{5}{2} \times height_{face}$.
Step 3. Define the boundary box B for region growing as a square with side length $width_B$, taking the bottom of the face region to be the top of the boundary region [...].
In the above algorithm, we obtain the coordinates of the start point S and the boundary box B, so we can get the clothes image Icloth of the person by region growing. We also know the clothes region Rcloth in the image where the face is detected. Rcloth has two pairs of coordinates (u, v) in the ICS: (cleft, ctop) and (cright, cbottom), which represent the boundary box of Rcloth.
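The geometry of the start point and the boundary square can be sketched in Python as follows. The factor 5/2 follows from the Vitruvian proportions quoted above (shoulder width of 1/4 height divided by face height of 1/10 height). The names `Box` and `clothes_seed_and_boundary` are hypothetical. The sketch also makes two assumptions that the text does not state explicitly: the v coordinate grows downward, and the square is centred horizontally under the face.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def clothes_seed_and_boundary(face: Box) -> Tuple[Tuple[float, float], Box]:
    """Return the start point S and the boundary square B for region growing."""
    face_height = face.bottom - face.top  # v is assumed to grow downward
    width_b = 2.5 * face_height  # shoulder width = (1/4) / (1/10) face heights
    u_center = (face.left + face.right) / 2.0  # assumption: centred under the face
    boundary = Box(
        left=u_center - width_b / 2.0,
        top=face.bottom,  # the bottom of the face is the top of the square
        right=u_center + width_b / 2.0,
        bottom=face.bottom + width_b,
    )
    seed = (u_center, face.bottom + width_b / 2.0)  # centre of the square
    return seed, boundary
```

For a face box with corners (100, 50) and (140, 90), the face height is 40 pixels, so the square has a side of 100 pixels and the seed lies at (120, 140).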
Figure 5.2 The drawing of Vitruvian Man by Leonardo da Vinci.
Figure 5.3 The body proportion according to Vitruvius.
5.4 Human Tracking by Motion Analysis of Human Clothes
In this section, we introduce the method for human tracking by human clothes.
To track a person, besides knowing the current position of the target, the vehicle has to predict the target’s motion, because the vehicle responds to movement commands with a delay. To predict the motion of the target person, we record the person’s motion as the basis for prediction. The recording format is described in Section 5.4.1, and the method for motion detection by clothes region intersection is described in Section 5.4.2.
5.4.1 Recording of Human Motion
Because the vehicle tries to make the target person always appear at the center of the image, the movement of the target person from the center to the present location in the image can be seen as the relative movement of the target person and the vehicle.
Therefore, we record the relative movement in the last cycle as a reference for predicting the movement of the target person in the current cycle. Since we know the position (ucurrent, vcurrent) of the person and the image center (ucenter, vcenter) in the ICS, we can compute the relative movement (umove, vmove) of the target person by
$(u_{move}, v_{move}) = (u_{current} - u_{center},\; v_{current} - v_{center})$.
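As a small illustration, the relative movement can be computed as the offset of the person’s current position from the image center. The function name `relative_movement` and the 320 × 240 image size in the usage note are assumptions made only for this example.

```python
from typing import Tuple

Point = Tuple[int, int]  # (u, v) in the ICS


def relative_movement(current: Point, center: Point) -> Point:
    """Offset of the target from the image center, taken as its relative movement."""
    u_move = current[0] - center[0]
    v_move = current[1] - center[1]
    return (u_move, v_move)
```

If the image is 320 × 240 pixels with its center at (160, 120) and the person is found at (200, 110), the recorded movement is (40, -10).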
5.4.2 Motion Detection by Cloth Region Intersection
To detect the location of the target person in the current cycle from the clothes, we use the intersection of clothes regions to predict the direction of the target person. The method computes only the directional variation of the target person. The details of the proposed clothes region intersection are described in the following algorithm.
Algorithm 5.3 Cloth region intersection.
Input: Cloth image Icloth, the initial region Rinitial which is the target clothes region.
Output: The current region Rcurrent of the person’s clothes in the image.
Steps:
Step 1. Capture an image Icurrent.
Step 2. Subtract Icloth from Icurrent at Rinitial pixel by pixel to obtain a new image Iintersect, which is the intersection of the two images at the region Rintersect.
Step 3. Grow a new clothes region Rcurrent by region growing, with the starting pixels randomly chosen from Iintersect.
Step 4. Let Rinitial = Rcurrent, and repeat from Step 1.
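One cycle of the clothes region intersection might look like the following pure-Python sketch. It is an illustration under simplifying assumptions, not the implementation used in this study: images are grey-level lists of rows, a pixel counts as matching when its difference from the stored clothes image is within a tolerance `tol`, and all matching pixels are used as seeds instead of a random subset. The names `intersect`, `grow`, `bounding_box`, and `tol` are hypothetical.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)
Image = List[List[int]]  # grey levels; colour would need a per-channel distance


def intersect(current: Image, cloth: Image, origin: Pixel, tol: int) -> Set[Pixel]:
    """Pixels of `current`, inside the window at `origin`, that still match `cloth`."""
    r0, c0 = origin
    matched = set()
    for r, row in enumerate(cloth):
        for c, value in enumerate(row):
            rr, cc = r0 + r, c0 + c
            inside = 0 <= rr < len(current) and 0 <= cc < len(current[0])
            if inside and abs(current[rr][cc] - value) <= tol:
                matched.add((rr, cc))
    return matched


def grow(image: Image, seeds: Set[Pixel], tol: int) -> Set[Pixel]:
    """4-connected region growing; a neighbour joins if it is within `tol`
    of the pixel from which it was reached."""
    region = set(seeds)
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < len(image) and 0 <= nc < len(image[0])
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - image[r][c]) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def bounding_box(region: Set[Pixel]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a non-empty region."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return min(cols), min(rows), max(cols), max(rows)
```

The bounding box of the grown region then serves as Rinitial for the next cycle. An empty intersection would correspond to a cycle in which the target is judged missing.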
5.5 Applications to Stranger Tracking and Person Following
5.5.1 Stranger Tracking
By combining the process of human detection in the previous chapters with the process of human tracking in this chapter, the system can track intruding persons in indoor environments for security patrolling.
After a human face is detected in the image, we extract the image of the clothes by the method described in Section 5.3. After the clothes region is extracted, the vehicle is commanded to track the person whose clothes match the color of the clothes image until the target person disappears from the field of view. In that case, the vehicle returns to the detection mode described in Chapter 4.
5.5.2 Person Following
With a pre-learning strategy for the target person, the system can follow a specified person, which we call person following. Unlike the stranger tracking just described, the clothes of the target person are taught to the system manually.
The user specifies the start point and the boundary box in the image for region growing of the clothes. After learning the image part of the person’s clothes, the system enters the human tracking process described previously in this chapter.
Figure 5.4 The application for stranger tracking.
Figure 5.5 The application for person following.
5.5.3 Experimental Results
In this section, we show some experimental results of human tracking in Figure 5.6. After the vehicle detected a human face in an image and extracted the clothes of the person, it started the process of human tracking. Figure 5.6 shows the consecutive images taken by the camera mounted on the vehicle while the vehicle tracked the target person. The yellow box in each image represents the intersection of the clothes regions of two consecutive images. According to the intersection region, the vehicle computes the present position of the person and turns its head to aim at the target person. In this way, the person appears at the center of the images all the time.
Figure 5.6 The consecutive images which were taken by the camera equipped on the vehicle when the vehicle tracked the target person. The order of the images is (a) through (j).
Chapter 6
Escape of the Vehicle from Strangers by Safe-Distance Keeping
6.1 Overview
The mobility of the vehicle makes the corners of a house viewable by the camera on the vehicle. On the other hand, this mobility also creates the risk that the vehicle might easily be stolen by an intruding person.
To avoid attacks from strangers, we design a mechanism for the vehicle to escape from strangers. The escape process has two stages: detection of dangerous situations and path planning for escape. In Section 6.2, we state the principle of escape, which includes the definition of dangerous situations and the path planning strategy for escape.
We regard the vehicle as safe if nobody appears within a pre-defined range of distances from the vehicle. Accordingly, we need to compute the distance between the vehicle and the person who is detected in the image. We describe how we accomplish this in Section 6.3. To keep the stranger in the field of view of the camera, we need a method to adjust the orientation of the camera. The proposed method is presented in Section 6.4.
6.2 Principle of Escape
In this section, we describe the principle of escape for the vehicle. Before describing the escape method, we have to decide in which situations the vehicle should escape. We define three states for the vehicle: the safe state, the unsafe state, and the buffer state. When the vehicle is in the unsafe state, we command it to escape. When it is in the safe state, it is commanded to move toward the target person. When it is in the buffer state, we command it to keep conducting the human detection and tracking process without moving forward or backward. To define these three states, we introduce the concept of safe-distance: if someone is too close to the vehicle, we infer that he or she might be about to attack the vehicle. Imagine a circle on the xy-plane in the VCS with the vehicle as the center and a pre-defined distance r, the safe-distance, as the radius. If a stranger is within the circle, that is, if the distance between the vehicle and the person is less than the safe-distance, we say the vehicle is in the unsafe state; otherwise, the vehicle is in the buffer state or the safe state. The buffer area lies between the unsafe and safe areas: if a person stays in the buffer area, the vehicle moves neither forward nor backward. To keep the vehicle safe, as soon as we detect that a stranger has transgressed the safe-distance, we command the vehicle to escape. The process is shown in Figure 6.1.
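The three states can be expressed as a simple classification of the estimated distance, sketched below in Python. The text defines only the safe-distance r; the parameter `buffer_width`, which sets how far the buffer area extends beyond r, is a hypothetical value introduced for this illustration, as are the names `State` and `classify`.

```python
from enum import Enum


class State(Enum):
    UNSAFE = "unsafe"  # escape
    BUFFER = "buffer"  # keep detecting and tracking, but do not move
    SAFE = "safe"      # move toward the target person


def classify(distance: float, safe_distance: float, buffer_width: float) -> State:
    """Classify the vehicle state from the person's distance (same unit for all)."""
    if distance < safe_distance:
        return State.UNSAFE  # the person is inside the circle of radius r
    if distance < safe_distance + buffer_width:
        return State.BUFFER  # the person is in the ring just outside the circle
    return State.SAFE
```

A person exactly at the safe-distance is treated as being in the buffer area here, because the unsafe state is defined by a distance strictly less than the safe-distance.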
To compute the distance between the detected stranger and the vehicle, we conduct the face detection process described in the previous chapter. We assume the height of a human face to be between 20 cm and 25 cm and the height of a human body to be between 50 cm and 60 cm. With these assumed heights of the face and the body and the pre-defined safe-distance, we can compute the distance between the person and the vehicle by the angular mapping described in Chapter 3. Then, we can determine whether the detected person transgresses the safe-distance or not.
Figure 6.1 The process of escape for the vehicle.
If the vehicle is in an unsafe state, we have to command the vehicle to escape. However, there is only one camera equipped on the vehicle, and the camera has to keep observing the stranger, so it cannot also be used to inspect the environment for an escape route. Thus, we record the path along which the vehicle has moved as a solution to this problem. When the vehicle has to escape, it is moved backward along the recorded path. While the vehicle escapes in an unsafe state, the camera observes the stranger continually. When the target person is beyond the safe-distance, the vehicle tracks the person again.
Figure 6.2 The safe-distance for the vehicle. The vehicle is in an unsafe state when a person is detected within the green circle. If the person is in the yellow area, the vehicle is in a buffer state. Otherwise, it is in a safe state.
6.3 Distance Computation from the Vehicle to a Stranger
We use the face region or the clothes region of the person to compute the distance between the person and the vehicle and to determine whether the state of the vehicle is safe or unsafe. However, the image gives us only the angular information of the face; we also need the height of the face to compute the distance. Moreover, we do not need the exact distance between the person and the vehicle to decide the state of the vehicle; we only need to know whether the distance is smaller or larger than the safe-distance defined in advance. To compute the distance from the person to the vehicle, we make a few assumptions: the person is standing on the ground, the length of his/her face is between 20 cm and 25 cm, and the length of his/her body is between 50 cm and 60 cm.
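To illustrate how an assumed range of face lengths leads to a range of distances, the following Python sketch uses a simple geometric model that is an assumption of this example and may differ from the angular mapping of Chapter 3. The top and bottom of the face are seen at elevation angles θ_top and θ_bottom measured from the horizontal through the camera, so their heights relative to the camera are d·tan θ_top and d·tan θ_bottom at horizontal distance d. Their difference equals the face length C, which gives d = C / (tan θ_top − tan θ_bottom). The function name `distance_range` and its parameters are hypothetical.

```python
import math
from typing import Tuple


def distance_range(
    theta_top: float, theta_bottom: float, c_min: float, c_max: float
) -> Tuple[float, float]:
    """Distance range for a target whose real length lies in [c_min, c_max].

    Angles are in radians, measured upward from the horizontal through the
    camera; the result is in the same unit as c_min and c_max.
    """
    spread = math.tan(theta_top) - math.tan(theta_bottom)
    if spread <= 0:
        raise ValueError("theta_top must be above theta_bottom")
    return c_min / spread, c_max / spread
```

Because only a comparison with the safe-distance is needed, the two ends of the returned range can be compared with the safe-distance directly. The same function applies to the clothes region with the 50 cm to 60 cm body length.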
The following algorithm shows the details to compute the distance between the person and the vehicle.
Figure 6.3 The illustration of the distance between the person and the vehicle. (a) Distance computing using length of the face. (b) Distance computing using length of the clothes.
Algorithm 6.1 Calculation of the distance from the vehicle to the person by the face region.
Input: The detected face region Rface in an image, where Rface has two pairs of coordinates in the ICS: (fleft, ftop) and (fright, fbottom) which represent the boundary box of Rface. The range of the length of a human face [C1, C2].
Output: The range [D0, D1] of the distance between the person and the vehicle.
Steps:
Step 1. Transform the coordinates ( , ) 2
Step 2. Referring to Figure 6.3(a) and assuming the distance of the person to be d, compute the value of h1 − h2 by
Algorithm 6.2 Calculation of the distance from the vehicle to the person by the clothes region.
Input: The detected clothes region Rclothes in an image, where Rclothes has two pairs of coordinates in the ICS: (fleft, ftop) and (fright, fbottom) which represent the boundary box of Rclothes. The range of the length of a human body [C1, C2].
Output: The range [D0, D1] of the distance between the person and the vehicle.
Steps:
Step 1. Transform the coordinates ( , ) 2
Step 2. Referring to Figure 6.3(b) and assuming the distance between the person and the vehicle to be d, compute the value of h1 − h2 by
6.4 Method for Adjustment of Camera Orientation for Human Monitoring
After deciding the state of the vehicle, if the vehicle is in an unsafe state, we command the vehicle to escape. After an escape command is given, the vehicle moves backward to its position in the last navigation cycle. Since the camera is carried by the robot arm on the vehicle, the camera might no longer face the stranger after the backward movement. To keep monitoring the stranger continuously, we have to reset the orientation of the camera in every cycle. From the records of the last movement of the vehicle, we obtain the turning angle θ. Moving backward means that the vehicle has to turn by the angle −θ to escape. Thus, as soon as the vehicle has moved to the last position, we command the camera to turn by the angle θ, which keeps the camera in the correct orientation for in-view observation of the intruding person.
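The bookkeeping behind this adjustment can be sketched in Python as follows. The sketch assumes that each recorded movement consists of a turn by θ followed by a forward move, so that undoing it means moving backward, turning by −θ, and then turning the camera by θ to compensate. The `Move` record, the function `escape_commands`, and the command names are hypothetical and do not correspond to the actual control interface of the vehicle.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Move:
    turn: float      # turning angle θ applied before advancing, in degrees
    distance: float  # forward distance travelled after the turn


def escape_commands(history: List[Move]) -> List[Tuple[str, float]]:
    """Undo the most recent recorded move and re-aim the camera at the stranger."""
    if not history:
        return []  # nothing recorded, so there is no previous position
    last = history.pop()
    return [
        ("vehicle_move", -last.distance),  # go back to the last position
        ("vehicle_turn", -last.turn),      # undo the vehicle's turn
        ("camera_turn", last.turn),        # compensate so the camera keeps its aim
    ]
```

After these three commands, the vehicle body has turned by −θ and the camera by θ relative to the body, so the camera keeps the direction it had before the escape.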
An experimental result is shown in Figure 6.4. At cycle time t = 1, the vehicle detected a face in the grabbed image and computed the distance of the person. The red box means that the person transgressed the pre-defined safe-distance. Thus, the vehicle moved backward to the last position at t = 2. The yellow box marks the detected face of the person, who is now at a farther distance from the vehicle.
Figure 6.4 The experimental result in which the vehicle moves backward to the last position after detecting that the person is too close. (a) The images grabbed by the camera equipped on the vehicle. (b) The images captured from a third person’s viewpoint.
Chapter 7
Experimental Results and Discussions
7.1 Experimental Results
We will show some experimental results of the proposed human detection and following system in this section. The user interface of the system is shown in Figure 7.1. All experiments of this study were conducted in our laboratory, Computer Vision Laboratory at the Department of Computer Science at National Chiao Tung University in Hsinchu, Taiwan. The proposed system has two modes: a detection mode and a tracking mode. The system will detect humans in acquired images in the detection mode. In the tracking mode, the system will track a target person by the feature of extracted clothes.
After a user presses the start button, the system starts monitoring the environment in the detection mode. In each cycle, the system first determines the state of the vehicle (safe or unsafe). To do this, the system needs to know whether a person transgresses the safe-distance of the vehicle. A detected face region in the image is the only feature for computing the distance between the person and the vehicle.
Therefore, the system conducts face detection first in each cycle, in both the detection mode and the tracking mode. If a face is detected and the distance of the person is smaller