Chapter 5 Monitoring of a Nearby Static Car around a Static Video
5.3 Distance Estimation of a Static Car
5.3.3 Calculation of Car Distance and Creation of Surround Map
To show the relative position of the video surveillance vehicle and the detected car on a surround map, the 3D position of the detected car is required. For this purpose, the 3D data of the corresponding point pairs are computed by the 3D data acquisition method described in Section 2.3. Let the height and the distance of each point pair be denoted as Hi and Di, respectively, and let the total number of the corresponding point pairs be n. The height Hcar and the distance Dcar of the detected car are taken to be the mean values of Hi and Di over these point pairs, respectively:
$$H_{car} = \frac{1}{n}\sum_{i=1}^{n} H_i, \qquad D_{car} = \frac{1}{n}\sum_{i=1}^{n} D_i.$$
In addition, from each point pair we can obtain the azimuth angle θ with respect to the X-axis. The azimuth angle θc of the middle point pair among these corresponding points is selected to represent the azimuth angle θcar of the detected car.
With the horizontal distance Dcar and the azimuth angle θcar, the relative position of the detected car can be described by the coordinates (uc, vc) in a top-view 2D coordinate system created for displaying the surround map, which may be computed in the following way:
$$u_c = \frac{D_{car}\cos\theta_{car}}{ratio}; \qquad v_c = \frac{D_{car}\sin\theta_{car}}{ratio}, \tag{5.13}$$
where the value ratio is a scaling factor used to scale real WCS distances down into the top-view 2D coordinate system.
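The computation above can be illustrated with a minimal Python sketch. It assumes that the per-point-pair heights, distances, and azimuth angles (in radians) are already available as lists, and that the "middle point pair" is the one at index n // 2 in list order; the function name and the sample numbers are hypothetical and serve only as an illustration.

```python
import math

def car_top_view_position(heights, distances, azimuths_rad, ratio):
    """Average the point-pair data and map the car into top-view coordinates."""
    n = len(distances)
    h_car = sum(heights) / n              # mean height of the point pairs
    d_car = sum(distances) / n            # mean horizontal distance
    theta_car = azimuths_rad[n // 2]      # azimuth of the middle point pair (assumed index)
    u_c = d_car * math.cos(theta_car) / ratio
    v_c = d_car * math.sin(theta_car) / ratio
    return h_car, d_car, (u_c, v_c)

# Hypothetical sample data: three point pairs, scaling factor 10.
h_car, d_car, (u_c, v_c) = car_top_view_position(
    heights=[140.0, 150.0, 160.0],
    distances=[290.0, 300.0, 310.0],
    azimuths_rad=[0.0, math.pi / 2, math.pi],
    ratio=10.0,
)
```

With these sample values the middle azimuth is 90 degrees, so the car lands on the v-axis at one tenth of the mean distance.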
Once the 3D information of the detected car is obtained, we can generate the surround map from the top view, as shown in the example of Fig. 5.10. A detailed algorithm for doing this is given in the following.
Figure 5.10 A surround map from the top view.
Algorithm 5.4 A method of generating the surround map.
Input: the position of a detected nearby car at top-view coordinates (uc, vc) computed by Eqs. (5.13).
Output: a top-view surround map of the vehicle environment including the video surveillance vehicle and the detected nearby car.
Steps.
Step 1. Initialize a background image I with all pixels colored in a gray color like that of the asphalt road.
Step 2. Paste a graphic model of the video surveillance vehicle at the center of image I.
Step 3. Select the front-right corner of the video surveillance vehicle model as the origin with coordinates (0, 0) of a top-view 2D coordinate system for displaying the desired surround map.
Step 4. Paste a graphic model of the nearby car on I at coordinates (uc, vc).
Step 5. Take the final I as the desired surround map.
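The steps of Algorithm 5.4 can be sketched with NumPy as follows. This is only an illustrative sketch under several assumptions that the text does not state: the graphic models are small RGB arrays, the vehicle front points toward the top of the image (so the front-right corner is the top-right corner of the pasted model), u increases to the right and v increases downward, and the nearby car is pasted with its center at (uc, vc). The names paste and make_surround_map are hypothetical.

```python
import numpy as np

def paste(image, model, top, left):
    """Copy model into image with its top-left corner at (top, left), clipped at the borders."""
    h, w = model.shape[:2]
    r0, c0 = max(top, 0), max(left, 0)
    r1, c1 = min(top + h, image.shape[0]), min(left + w, image.shape[1])
    if r0 < r1 and c0 < c1:
        image[r0:r1, c0:c1] = model[r0 - top:r1 - top, c0 - left:c1 - left]

def make_surround_map(height, width, vehicle_model, car_model, uc, vc, road_gray=128):
    # Step 1: background image I filled with an asphalt-like gray.
    surround = np.full((height, width, 3), road_gray, dtype=np.uint8)
    # Step 2: paste the surveillance vehicle model at the image center.
    vh, vw = vehicle_model.shape[:2]
    top, left = (height - vh) // 2, (width - vw) // 2
    paste(surround, vehicle_model, top, left)
    # Step 3: origin (0, 0) at the front-right corner (assumed: top-right of the model).
    origin_row, origin_col = top, left + vw
    # Step 4: paste the nearby car centered at (uc, vc) relative to the origin.
    ch, cw = car_model.shape[:2]
    paste(surround, car_model,
          origin_row + int(round(vc)) - ch // 2,
          origin_col + int(round(uc)) - cw // 2)
    # Step 5: the final image is the surround map.
    return surround

# Hypothetical models: a white vehicle and a dark nearby car.
vehicle = np.full((20, 10, 3), 255, dtype=np.uint8)
car = np.full((20, 10, 3), 50, dtype=np.uint8)
surround_map = make_surround_map(100, 100, vehicle, car, uc=30, vc=0)
```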
Chapter 6
Monitoring of a Nearby Static or Moving Car with a Moving Video Surveillance Vehicle
6.1 Idea of Detection of Static or Moving Car in Omni-images
In Chapter 5, we have described the proposed method for detection and display of a nearby car. Both the detected car and the video surveillance vehicle are assumed to be static there. However, in this study we propose further a method to detect a nearby moving or static car while the video surveillance vehicle is being driven.
Optical flow analysis may be used again here to estimate the motion of an object in consecutively acquired omni-images, and if the concerned object is higher than the ground, its motion in the image will produce motion vectors with larger lengths. This property may be used to segment the car from the background. Moreover, we also analyze the color of the detected car by the k-means algorithm and use the color information to segment out the car region in the omni-image. Finally, the position of the detected car is estimated by a template matching method proposed in this study, and a surround map is generated accordingly.
As a summary, we may divide the proposed method into five major steps: (1) computing the motion vectors by optical flow analysis; (2) separating the car region from the non-car one roughly according to the motion vector lengths; (3) segmenting
out the car region by the color information; (4) estimating the position of the detected car; and (5) generating a surround map. A flowchart illustrating these major steps of the proposed method is shown in Figure 6.1. Each of these steps is described in detail in the following sections.
6.2 Moving Car Detection by Motion Vectors Generated by Optical Flow Analysis
6.2.1 Detection of Car Region by Motion Vector Lengths
To separate the car region from the non-car one in the consecutively acquired omni-images, we use optical flow analysis to produce the motion vectors and detect the car region by the motion vector lengths. The details of this process are described in order subsequently.
A. Block Based Processing
To monitor the surrounding environment of the video surveillance vehicle, we select evenly distributed points in the omni-image and compute their motion vectors by the optical flow analysis discussed previously. For the selected points to be evenly distributed in the omni-image, we divide the omni-image into equal-sized blocks and take the centers of the blocks as the selected points. An example of the block-based omni-image is shown in Figure 6.2. The subsequent processing takes the image block as the unit of processing.
Figure 6.2 An example of a block-based omni-image. The black region is the roof of the video surveillance vehicle, which we ignore, and the red points are the selected points.
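The block-based point selection can be sketched as follows. The sketch assumes a rectangular image divided into square blocks of a hypothetical size block_size, and an optional Boolean mask marking pixels to ignore (for example, the vehicle roof); neither the block size nor the mask representation is specified in the text.

```python
import numpy as np

def block_centers(height, width, block_size, ignore_mask=None):
    """Return the (row, col) centers of all full blocks, skipping ignored ones."""
    rows = np.arange(height // block_size) * block_size + block_size // 2
    cols = np.arange(width // block_size) * block_size + block_size // 2
    rr, cc = np.meshgrid(rows, cols, indexing="ij")
    centers = np.stack([rr.ravel(), cc.ravel()], axis=1)
    if ignore_mask is not None:
        # Drop every center that falls on an ignored pixel.
        keep = ~ignore_mask[centers[:, 0], centers[:, 1]]
        centers = centers[keep]
    return centers

# Hypothetical 40 x 60 image with 20 x 20 blocks; the top-left block is ignored.
roof = np.zeros((40, 60), dtype=bool)
roof[0:20, 0:20] = True
all_points = block_centers(40, 60, 20)
kept_points = block_centers(40, 60, 20, ignore_mask=roof)
```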
B. Estimation and Transformation of Motion Vectors
We want to estimate the motions of objects in the surrounding environment from two consecutive omni-images. The process is divided into two stages: (1) estimation of the motion vectors by optical flow analysis; and (2) transformation of the motion vectors from the ICS into the WCS. The optical flow analysis method used in Stage 1 has been reviewed in Section 4.2.2.B. And the transformation process used in Stage 2 is identical to that described in Section 4.2.2.C.
C. Detection of Ground and Car Regions by Motion Vector Lengths
Because the driving speed of the video surveillance vehicle is not constant and the lengths of the motion vectors are roughly proportional to the car speed, we use dynamic thresholding to separate the car region from the background, as shown by the example in Fig. 6.3. The threshold values are set using the mean and the standard deviation of all the motion vector lengths. Assume that the length of each motion vector is Li, i = 1, 2, ..., n, and let L̄ and Sn denote the mean and the standard deviation of these lengths, respectively.
The threshold values for detecting the car and the ground are respectively set as follows:
$$L_{car} = \bar{L} + S_n; \tag{6.3}$$
$$L_{ground} = \bar{L} - S_n. \tag{6.4}$$
To record the car region, we initialize a record image Ir which is of the same size as the original omni-image. The car region to be put into the image Ir is labeled by “1”; and the other regions, by “0.” Each motion vector produced by optical flow analysis is compared with the threshold values Lcar and Lground. If the length of the motion vector is larger than Lcar, we regard the block yielding the vector as a region of the detected car and label it by “1” in image Ir; else, we label the block by “0.” In addition, the threshold value Lground is used in the following way: if the length of a motion vector is smaller than Lground, every pixel in the corresponding block is regarded as a ground point and pushed into a buffer B for use in the ground learning process described in the subsequent section.
In summary, we list the rules for car detection as follows:
$$\begin{cases}\text{if length of motion vector} > \bar{L} + S_n, & \text{then label the region as ``car''};\\ \text{if length of motion vector} < \bar{L} - S_n, & \text{then label the region as ``ground.''}\end{cases} \tag{6.5}$$
Figure 6.3 A result of separating the car region from the non-car region, where the red points are used to represent the car region and the green points to represent the non-car region.
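The dynamic thresholding rule can be sketched in a few lines of NumPy. The sketch assumes that the motion vector lengths of all blocks are given as a one-dimensional array and that the standard deviation is the population standard deviation (divisor n); the latter is an assumption, since the text does not state which estimator is used. The function name is hypothetical.

```python
import numpy as np

def classify_by_length(lengths):
    """Split blocks into car / ground candidates by motion vector length."""
    lengths = np.asarray(lengths, dtype=float)
    mean = lengths.mean()
    std = lengths.std()          # population standard deviation (assumed)
    l_car = mean + std           # threshold of Eq. (6.3)
    l_ground = mean - std        # threshold of Eq. (6.4)
    car = lengths > l_car        # blocks labeled "1" in the record image
    ground = lengths < l_ground  # blocks whose pixels go into the ground buffer
    return car, ground, l_car, l_ground

# Hypothetical lengths for five blocks.
car, ground, l_car, l_ground = classify_by_length([0.0, 2.0, 2.0, 2.0, 4.0])
```

Blocks whose lengths fall between the two thresholds are labeled neither as car nor as ground, which matches the rule that they simply receive the label “0” in the record image.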
6.2.2 Detection of Car Body by k-means Algorithm
To detect the pixels of the car body in image Ir for use by the region growing method, we divide the process into two steps: (1) use the k-means algorithm to partition a set of feature points into three clusters; and (2) determine which cluster is the car body. The steps of the k-means algorithm are illustrated in Fig. 6.4, and a detailed algorithm implementing it is given as follows.
Figure 6.4 An illustration of the k-means algorithm. (a) Initialization of the cluster centers. (b) Association of every data point with the nearest mean. (c) Reassignment of the cluster centers. (d) The result of the k-means algorithm.
Algorithm 6.1 Partitioning feature data into clusters.
Input: a set of feature points Di and a total number k of clusters.
Output: the center point Cj of each cluster and a set of input feature points labeled with cluster index.
Steps.
Step 1. Initialize k cluster centers randomly among the input data.
Step 2. Perform the following steps to the feature points until either the number of iterations reaches a pre-selected limit or the centers of clusters become stable (with no change in the positions of the cluster centers).
2.1 Calculate the distance between each feature point Di and each cluster center, and label Di with the index of the closest cluster center.
2.2 Update each cluster center Cj by calculating the mean value of the feature points which are labeled as j.
As a result, the k-means algorithm may be used to partition a set of feature points into k clusters, with each point belonging to the nearest cluster. To use the algorithm in this study, the RGB values of the center pixel of each block in image Ir are taken as the input feature points into the k-means algorithm, and k is taken to be 3, i.e., all input feature data are partitioned into three clusters, in which one is the car body of a certain color. Note that in this study, we assume that each car is of a single color.
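Algorithm 6.1 can be sketched in NumPy as follows. This is a plain k-means sketch and not necessarily the exact implementation used in this study; the iteration limit, the random seed, and the handling of empty clusters (the old center is kept) are assumptions made for the illustration, and the sample RGB values are hypothetical.

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Partition points into k clusters; return (centers, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct input points as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iters):
        # Step 2.1: label each point with the index of its closest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2.2: move each center to the mean of its labeled points.
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:          # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break                          # centers are stable
        centers = new_centers
    return centers, labels

# Hypothetical RGB values of six block centers, partitioned with k = 3.
rgb = np.array([[250, 0, 0], [255, 5, 0], [0, 250, 0],
                [5, 255, 0], [0, 0, 250], [0, 5, 255]])
centers, labels = kmeans(rgb, k=3)
```

Because the initial centers are chosen randomly, different seeds may lead to different partitions of the same data.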
Furthermore, the car region in image Ir consists of three possible types of objects: the body of the detected car, the windows of the car, and noise components. Therefore, we have to find the cluster of the car body by analyzing the center of each cluster before applying the region growing method, which will be introduced in the following section.
Moreover, the color of the car window is sometimes similar to that of the ground. To avoid growing into the ground region, each cluster whose center has a gray value close to the ground value gm is ignored. That is, if the gray value of a cluster center is close to gm, we do not conduct region growing with that cluster as the starting point, as described in Section 6.2.3.B. The computation of the value gm will be introduced in Section 6.2.3.A. The algorithm for determining the cluster of the car body from those yielded by Algorithm 6.1 is described in detail as follows.
Algorithm 6.2 Determination of the cluster of the car body.
Input: the center points Ci of the clusters Si found by Algorithm 6.1 with RGB values (Ri, Gi, Bi), i = 1, 2, 3; the number Ni of feature points in each cluster Si, and the gray value gm of the ground.
Output: a cluster Sj of feature points of the car body or none.
Steps.
Step 1. Sort the numbers Ni of feature points of all the clusters Si, i = 1, 2, 3, and pick up the largest one with index j, namely, Nj, which is the number of feature points of cluster Sj.
Step 2. Compute the difference D between the gray value of the center Cj of cluster Sj and that of the ground, gm, as follows:
$$D = \left|\frac{R_j + G_j + B_j}{3} - g_m\right|.$$
Step 3. If the difference D is larger than a threshold TH, output the cluster Sj with index j as the desired car body; else, ignore the cluster Sj and go to Step 1 to process the remaining cluster(s).
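Algorithm 6.2 can be sketched as the following Python function. The clusters are visited in decreasing order of size, which is one way to realize "go to Step 1 to process the remaining cluster(s)"; the function name and the sample values of gm and TH are hypothetical.

```python
def select_car_body_cluster(centers, counts, g_m, th):
    """Return the index of the car-body cluster, or None if no cluster qualifies."""
    # Step 1: visit the clusters from the largest to the smallest.
    order = sorted(range(len(centers)), key=lambda i: counts[i], reverse=True)
    for j in order:
        r, g, b = centers[j]
        # Step 2: difference between the gray value of the center and the ground value.
        d = abs((r + g + b) / 3.0 - g_m)
        # Step 3: accept the cluster only if it differs enough from the ground.
        if d > th:
            return j
    return None

# Hypothetical cluster centers (RGB), cluster sizes, ground gray value, and threshold.
chosen = select_car_body_cluster(
    centers=[(100, 100, 100), (250, 250, 10), (20, 20, 20)],
    counts=[50, 30, 10],
    g_m=105,
    th=20,
)
nothing = select_car_body_cluster([(100, 100, 100)], [5], g_m=100, th=20)
```

In the first call the largest cluster is rejected because its gray value is within the threshold of the ground value, so the second-largest cluster is selected.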
6.2.3 Detection of Car Region by Color Information
After finding out the feature-point cluster of the detected car body by the above two algorithms, we are able to detect the entire car region more completely in the image Ir produced in Section 6.2.1.C by eliminating the ground area and growing the region of the detected car body, as described in the following.
A. Elimination of the Ground Region
To decrease the probability of false alarms and erroneous detections, we have to learn the ground information by the use of a buffer B which collects the ground pixels found in Section 6.2.1.C and eliminate the ground regions from the image Ir. The
To eliminate the ground region, we scan all the blocks in image Ir which are labeled by “1” and apply the following classification rule:
"if the difference value satisfies the following condition:
...,
then mark the block as the ground region in image Ir; else, continue."
After the ground region is eliminated, the detection of the car region in image Ir becomes more accurate.
B. Detection of Car Region by Region Growing within a Color Tolerance
To make the detected car shape more complete, we select the points in the region of the detected car body and regard these points as seed points to grow the neighboring regions within a color tolerance. More specifically, after we use Algorithm 6.2 to find the feature points of a detected car body, we use these points further to grow the entire car region. Each feature point in the cluster of the car body corresponds to a pixel in the image Ir, and we take the pixel as a seed point to grow the neighboring points under two conditions: (1) the difference of the color value between the seed point and the neighboring point is within a color tolerance; and (2) the neighboring point is inside the growing range centered at the seed point. If the color value of the neighboring point is similar to that of the seed point, we regard the neighboring point as belonging to the car region and label the block by “1” in the image Ir. The method of detecting the car region with a given color tolerance is described in Algorithm 6.3 below.
Algorithm 6.3 Detecting the car region by filling regions within a given color tolerance.
Input: the record image Ir, the original image Io acquired with the omni-camera, and the output data (the cluster centers) of the k-means algorithm described in Algorithm 6.1 in Section 6.2.2.
Output: the bi-level image Ir with the car region labeled by “1.”
Steps.
Step 1. Scan each input pixel p in Ir and check whether the label of p is the same as that of the cluster center of the car body or not. If the same, continue; else, scan the next input pixel in Ir.
Step 2. Regard the pixel p as a seed point and check the following classification rule for region growing by the color information:
"if a neighboring point of p satisfies the condition that the difference between Io(u, v)c and Io(u', v')c is within the color tolerance diff, then mark the corresponding block as belonging to the car region in the image Ir; else, continue," (6.8)
where Io(u, v)c with c = r, g, and b denotes the c-color value of the currently observed seed pixel located at image coordinates (u, v), Io(u', v')c with c = r, g, and b denotes the c-color value of a neighboring pixel at image coordinates (u', v'), and the value diff is the color tolerance between the currently observed pixel and one of its neighbors.
As shown in Fig. 6.5, after the region growing process by the color information using the above algorithm, we obtain a more accurately detected car shape from the omni-image.
Figure 6.5 A result of region growing by the color information. (a) The result of the region growing, where the purple points represent the grown region. (b) The corresponding bi-level image of (a).
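The region growing of Algorithm 6.3 can be sketched at the block level as follows. The sketch makes several assumptions that the text leaves open: the record image and the colors are given per block, the growing range is a square window of hypothetical half-width grow_range around each seed, and a neighbor is accepted when the absolute difference to the seed color is at most diff in every one of the r, g, and b channels.

```python
import numpy as np

def grow_car_region(record, colors, seeds, diff, grow_range):
    """Label blocks near each seed as car (1) if their color is close to the seed color."""
    grown = record.copy()
    h, w = record.shape
    colors = colors.astype(int)   # avoid uint8 wrap-around when subtracting
    for r, c in seeds:
        # Condition (2): only blocks inside the window centered at the seed.
        r0, r1 = max(r - grow_range, 0), min(r + grow_range + 1, h)
        c0, c1 = max(c - grow_range, 0), min(c + grow_range + 1, w)
        window = colors[r0:r1, c0:c1]
        # Condition (1): every channel differs from the seed by at most diff.
        similar = (np.abs(window - colors[r, c]) <= diff).all(axis=2)
        grown[r0:r1, c0:c1][similar] = 1
    return grown

# Hypothetical 5 x 5 block grid: a red car patch on a gray background.
colors = np.full((5, 5, 3), 100, dtype=np.uint8)
colors[1:4, 1:4] = (200, 30, 30)
colors[0, 0] = (205, 28, 33)      # similar color, but outside the growing range
record = np.zeros((5, 5), dtype=int)
record[2, 2] = 1
grown = grow_car_region(record, colors, seeds=[(2, 2)], diff=10, grow_range=1)
```

The block at (0, 0) stays unlabeled even though its color is close to the seed color, because it lies outside the growing range.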
6.3 Updating of Car State
6.3.1 Estimation of Car Location by Rectangular-shaped Models
To estimate the location of a nearby car, we match the detected car region in the image Ir against a mask to estimate the approximate car position. The method of estimating the car location includes two stages: (1) generation of the car mask by transforming a rectangular-shaped car model from the WCS into the ICS; and (2) detection of the car location by a template matching scheme.
Stage 1. Generation of the car mask on the image plane
To generate the car mask on the omni-image plane for matching the detected car, we have to transform the car model in the WCS to the ICS as shown in Fig. 6.6.
Because the car shape is similar to a rectangle, we imagine a rectangular-shaped car model on the X-Y plane and transform the model into the image plane to match and locate the detected car. For this, in this study we create a series of rectangular-shaped car models beside the video surveillance vehicle under the premise that the detected car is driving around the video surveillance vehicle.
Moreover, in order to shorten the computation time of the matching process, we just select sparse points on each rectangular-shaped model to represent the model instead of transforming the whole rectangular-shaped model region for matching the detected car.
More specifically, we divide the rectangular-shaped model into small blocks and transform the center point of each block from the WCS into the ICS to create a car mask. We use buffers to record the points transformed from each model in the ICS for the further matching process. The algorithm for such a transformation from the WCS into the ICS is described as follows.
Algorithm 6.3 Coordinate transformation of a car model point from WCS to ICS.
Input: a point P of the rectangular-shaped model with world coordinates (Xi, Yi, Zi),
the origin of the mirror center Om at coordinates (X0, Y0, Z0), and the
1.4 Compute the horizontal distance d between the point P and the mirror center Om by the following formula:
$$d = \sqrt{(X_i - X_0)^2 + (Y_i - Y_0)^2}.$$
Step 2. Find the image point corresponding to P, at image coordinates (ui, vi), by looking up the pano-mapping table with the computed azimuth angle and elevation angle.
After conducting the above process, we have transformed all the points of the rectangular-shaped car model from the real-world space to the image plane, and we keep all the coordinates (ui, vi) in a buffer for matching the detected car. Furthermore, we use the center of the mask to represent the car location, and the center of the mask is taken to be
$$\left(\frac{1}{n_i}\sum_{i} u_i,\; \frac{1}{n_i}\sum_{i} v_i\right),$$
where (ui, vi) are the image coordinates of the i-th point in the mask and ni is the total number of points in the mask.
To conduct the matching process, we generate in advance a series of masks for use at different positions as shown in Fig. 6.7 for accelerating the matching process.
However, the point density of each mask is not exactly the same. The points of a far rectangular-shaped model, after being transformed from the WCS into the ICS, will result in a high-density mask and the points in the mask may overlap each other.
Accordingly, the spacing ds between two points of the s-th rectangular-shaped model is related to the distance between the video surveillance vehicle and the model, and we compute the value ds as follows:
$$d_s = 10 + \left|s - \frac{n_{model}}{2}\right| \times 5, \tag{6.11}$$
where nmodel is the total number of the points in the model.
Figure 6.7 Results of masks in the omni-image. (a) A mask near the video surveillance vehicle. (b) A mask far from the video surveillance vehicle.
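The spacing rule and the mask center can be illustrated with a short Python sketch. It reads Eq. (6.11) as ds = 10 + |s - nmodel/2| x 5, where the minus and multiplication signs are assumed because they are not legible in the equation as printed, and it assumes that the mask center is the mean of the mask point coordinates. The function names and sample values are hypothetical.

```python
def mask_spacing(s, n_model):
    """Point spacing of the s-th model, reading Eq. (6.11) as 10 + |s - n_model/2| * 5."""
    return 10 + abs(s - n_model / 2) * 5

def mask_center(points):
    """Mean of the mask point coordinates (u, v), used to represent the car location."""
    n = len(points)
    return (sum(u for u, _ in points) / n, sum(v for _, v in points) / n)

# Hypothetical values: the middle model gets the smallest spacing.
middle_spacing = mask_spacing(5, 10)
outer_spacing = mask_spacing(0, 10)
center = mask_center([(0, 0), (2, 4), (4, 8)])
```

Under this reading, models whose index is far from nmodel/2 are sampled more sparsely, which counteracts the higher point density of far masks in the omni-image.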
Stage 2. Detection of the detected car location by template matching method
To locate the detected car in the omni-image, we use a template matching method to overlap the detected car with the mask. See Fig. 6.8 for an illustration. The masks are generated in the previous stage, Stage 1, and we have to check the overlapping ratio between the mask points of the i-th mask and the detected car shape in the image Ir. The mask which results in the highest overlapping ratio is regarded as the most suitable one for the detected car, and the center point of the mask at coordinates (ui, vi) is finally taken as the position of the detected car. The