
Chapter 4 Cast-Shadow Detection in Traffic Images

4.3 Comparison Results

For traffic monitoring and surveillance applications, shadow suppression prevents misclassification and erroneous counting of moving vehicles. The goal of shadow suppression is to minimize the false negatives (FN_S, the shadow pixels misclassified as background/foreground) and the false positives (FP_S, the background/foreground pixels misclassified as shadow pixels). In order to systematically evaluate the performance of the proposed method, we adopted two metrics, namely the shadow detection rate η and the object detection rate ξ [36], for quantitative comparison:

η = TP_S / (TP_S + FN_S),  (4.3)

ξ = TP_F / (TP_F + FN_F),  (4.4)

where TP_S (resp. TP_F) is the number of shadow (resp. foreground) pixels correctly identified, and FN_S (resp. FN_F) is the number of shadow (resp. foreground) pixels falsely identified.
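To make the metric definitions concrete, the following sketch computes η and ξ from per-pixel label masks; a minimal illustration, assuming ground-truth and predicted labels are available as numpy arrays (the label encoding used here is a hypothetical choice, not from the thesis):

```python
import numpy as np

# Hypothetical label encoding: 0 = background, 1 = foreground, 2 = shadow.
def detection_rates(gt, pred):
    """Compute shadow detection rate (eta) and object detection rate (xi)
    from ground-truth and predicted label masks of equal shape."""
    tp_s = np.sum((gt == 2) & (pred == 2))   # shadow pixels correctly identified
    fn_s = np.sum((gt == 2) & (pred != 2))   # shadow pixels missed
    tp_f = np.sum((gt == 1) & (pred == 1))   # foreground pixels correctly identified
    fn_f = np.sum((gt == 1) & (pred != 1))   # foreground pixels missed
    eta = tp_s / (tp_s + fn_s)               # shadow detection rate, (4.3)
    xi = tp_f / (tp_f + fn_f)                # object detection rate, (4.4)
    return eta, xi
```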

A comparison with existing methods has been carried out to validate the performance of the proposed algorithm. A statistical nonparametric (SNP) approach [72] and a deterministic nonmodel-based (DNM) approach [38] were selected for comparison. The SNP approach treated object colors as a reflectance model derived from the Lambertian hypothesis. The work used the normalized distortion of the brightness α_i and the distortion of the chrominance CD_i′, computed from the difference between the background color of a pixel and its value in the current image, to classify a pixel into four categories:

Foreground: CD_i′ > τ_CD or α_i′ < τ_αlo; otherwise, the pixel is assigned to one of the remaining categories (background, shadow, or highlight) according to α_i′.

The DNM approach works in the HSV color space. Shadow detection is determined according to the following equation:

SP_k(x, y) = 1 if α ≤ I_k^V(x, y) / B_k^V(x, y) ≤ β, |I_k^S(x, y) - B_k^S(x, y)| ≤ τ_S, and |I_k^H(x, y) - B_k^H(x, y)| ≤ τ_H; otherwise SP_k(x, y) = 0,  (4.5)

where I_k(x, y) and B_k(x, y) denote the pixel values of the input image and the background image (at frame k) respectively, the superscripts H, S, and V select the hue, saturation, and value components, and α, β, τ_S, τ_H are thresholds.
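As an illustration of this HSV shadow test, a minimal vectorized sketch is given below; the threshold values are placeholders, not the ones used in [38]:

```python
import numpy as np

def dnm_shadow_mask(img_hsv, bg_hsv, alpha=0.4, beta=0.9, tau_s=60, tau_h=50):
    """Per-pixel HSV shadow test in the style of the DNM approach:
    a pixel is shadow if its value (brightness) is attenuated by a
    bounded ratio while hue and saturation stay close to the background."""
    h, s, v = [img_hsv[..., i].astype(np.float32) for i in range(3)]
    bh, bs, bv = [bg_hsv[..., i].astype(np.float32) for i in range(3)]
    ratio = v / np.maximum(bv, 1e-6)         # brightness attenuation ratio
    return ((alpha <= ratio) & (ratio <= beta)
            & (np.abs(s - bs) <= tau_s)
            & (np.abs(h - bh) <= tau_h))
```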

To achieve an objective comparison, the model for the background image is updated in advance for all algorithms in the test. Figure 4-4 shows the test results of the different methods (Proposed, DNM [38], and SNP [72]) on the benchmark sequence Highway-I [36]. First, we check the computation time of each algorithm for efficiency comparison, as shown in Table 4-2. The proposed algorithm requires the least computation time: it is reduced to 3.6%-21% of that of the other two methods because it merely utilizes division operations to obtain the shadow information.



Fig. 4-4. Comparison results of the shadow suppression methods (red pixels represent the moving vehicles; blue pixels represent the attached shadows). (a) Original image. (b) The proposed method. (c) The SNP method. (d) The DNM method.

Table 4-2 Computation Time for Shadow Spectral Analysis

Algorithm      Proposed   SNP    DNM
Time (msec)    3.4        93     15.6

Specification: image size = 352x240; CPU = Intel Pentium 4, 2.4 GHz; RAM = 448 MB.

The SNP algorithm takes the longest time because of its complex normalization, which consists of square, square-root, addition, and division operations. Second, we examine the shadow detection rate and the object detection rate of each algorithm. The ground truth for each frame is necessary for calculating the quantitative metrics of (4.3) and (4.4). We manually and accurately classified the pixels into foreground, background, and shadow


regions in the image sequences. Figures 4-5 and 4-6 show the comparison results of the shadow detection rate and the object detection rate for the proposed method, SNP, and DNM. The mean accuracy values corresponding to the plots in Figs. 4-5 and 4-6 are reported in Table 4-3. The results evaluated by Prati et al. [36] (obtained by analyzing tens of frames for each video sequence, representative of different situations) are also listed in Table 4-3. The experimental results demonstrate that the proposed method generally provides more reliable object detection results compared with the other state-of-the-art methods. The results obtained from the algorithms with and without spatial analysis are also listed in the table to check the effectiveness of spatial analysis. One observes that, as expected, spatial analysis improves the performance of shadow detection. For instance, the shadow detection rate of the DNM algorithm is increased from 57% to 80%. A merit of spatial analysis is that it can be combined with other existing shadow suppression methods to further improve their performance.

The proposed method outperforms the other two methods in shadow suppression and moving object detection because it uses the ratio model, which is constructed from only a single image frame. The video clip of the experimental results can be found at:

http://isci.cn.nctu.edu.tw/video/JCTai/shadow1.wmv.

Fig. 4-5. Comparison result of shadow detection rate between the proposed method, DNM, and SNP.


Fig. 4-6. Comparison result of object detection rate between the proposed method, DNM, and SNP.

Table 4-3 The Accuracy of Detection Results

Method           Shadow detection rate*1   Object detection rate*1   Shadow detection rate*2   Object detection rate*2
Proposed & Spa.  77.5%                     72.2%                     76.86%                    80.52%
Proposed         62.9%                     60.7%                     71.97%                    70.14%
SNP              72.6%                     49.9%                     74.48% (81.59%*3)         59.39% (63.76%*4)
DNM              66.4%                     56.07%                    67.94% (69.72%*3)         68.57% (76.93%*4)

*1: The mean accuracy of detection results obtained by analyzing 300 frames for each video sequence.
*2: The mean accuracy of detection results obtained by analyzing tens of frames for each video sequence, representative of different situations.
*3: Results from [36].
*4: Results from [36]; the false positives belonging to the background were not considered in the computation of the object detection rate.

4.4 Summary

In this chapter, a shadow-region-based statistical nonparametric method has been developed to construct a ratio model for shadow detection of all pixels in an image frame.


Based on the Lambertian assumption, the RGB ratios between lit pixels and shadowed pixels can be treated as constants across an image sequence. This assumption leads to the development of a novel ratio model for detecting shadow pixels in traffic imagery. The proposed approach does not require a long image sequence to construct the model. Instead, the model can be easily built from a shadow region in a single image frame. To increase the accuracy of shadow detection, two types of spatial analysis are proposed to verify the actual shadow pixels.
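A minimal sketch of this ratio-model idea follows: per-channel lit/shadow ratios are estimated once from a supplied shadow region in a single frame, then used to test every pixel. The function names, sample-region mask, and tolerance are illustrative assumptions, not values from the thesis:

```python
import numpy as np

def build_ratio_model(frame, bg, sample_shadow_mask):
    """Estimate per-channel lit/shadow intensity ratios from one sample
    shadow region in a single frame (Lambertian assumption: the ratio
    is roughly constant over the scene)."""
    lit = bg[sample_shadow_mask].astype(np.float32)       # background (lit) values
    shaded = frame[sample_shadow_mask].astype(np.float32)  # same pixels under shadow
    return np.median(lit / np.maximum(shaded, 1.0), axis=0)  # one ratio per RGB channel

def shadow_mask(frame, bg, ratios, tol=0.15):
    """Classify pixels whose lit/shadow ratio matches the model as shadow."""
    r = bg.astype(np.float32) / np.maximum(frame.astype(np.float32), 1.0)
    return np.all(np.abs(r - ratios) <= tol * ratios, axis=-1)
```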


Chapter 5 Vehicle Detection and Tracking for Traffic Parameter Estimation

5.1 Introduction

Recently, image tracking has become an important technique for traffic monitoring and surveillance applications. Many algorithms based on image tracking have been developed for real-time traffic parameter estimation. After the moving vehicles have been segmented, the process of traffic parameter estimation consists of three main stages: vehicle detection, vehicle tracking, and traffic parameter estimation. Most existing algorithms use a specially designed region lying in each lane to detect and track entering vehicles. The main drawback of these methods is that only similarly sized vehicles passing through a particular region of the road can be detected and tracked. If the size of a vehicle differs from the predefined one, or if the vehicle does not pass through the particular region, it will not be detected and tracked by these algorithms.

In this chapter, we propose an image-based traffic monitoring system that automatically detects and tracks multiple different-sized vehicles traveling in any portion of a multi-lane road. We adopt active contour models to represent vehicle contours in an image frame. A method based on image measurement is developed to predict the initial positions and sizes of vehicles for image contour generation. This method features simultaneous detection of


multiple vehicles that travel in any portion of the road. Kalman filtering techniques are then applied for active-contour-based image tracking of various vehicles. Analyzing the detection and tracking results allows us to estimate traffic parameters such as traffic flow rate, vehicle speed, and traffic density. Moreover, by using an optical flow method to obtain the traveling direction of vehicles, the detection method can also be used for estimating the vehicle turn ratio at an intersection.

The rest of this chapter is organized as follows. Section 5.2 gives an overview of the system for traffic parameter estimation. An image measurement algorithm for active contour representation will be described. Section 5.3 presents the proposed image tracking method and its application in ITMS. Section 5.4 describes the turn ratio estimation based on optical flow measurement. Experimental results of traffic parameter estimation using the proposed method will be presented in Section 5.5. Section 5.6 summarizes the contribution of this work.

5.2 System Overview of Image Tracking

Figure 5-1 shows the system architecture of the proposed contour initialization and tracking system. This image tracking system consists of three parts: the image processing module, the contour initialization module and the vehicle tracking module. The image processing module captures the traffic scene, segments the background image, removes the shadow, and extracts the binary image of moving vehicles from the image sequence. The contour initialization module uses a detection window to generate an initial vehicle contour for tracking operation. Once initialized, the vehicle contour will be tracked and updated iteratively. The active contour model is exploited to represent vehicle contours in this design.

The vehicle tracking module employs a dynamic model to predict the vehicle contour from its historical states. The contour of the targeted vehicle is iteratively obtained by using image


measurement and Kalman filtering.

5.2.1 Active Contour Model

Active contour modeling is a powerful tool for model-based image segmentation [6], [73]-[74]. Based on the active contour concept, an image measurement method for obtaining the best-fit curve of a vehicle contour for image tracking is presented below. In this work, B-spline functions are adopted to represent vehicle contours in image frames. The vehicle contour (x(s), y(s)) is represented using N_B B-spline functions:

x(s) = B(s)^T Q_x,  y(s) = B(s)^T Q_y,  (5.1)

where B(s) = (B_1(s), ..., B_{N_B}(s))^T collects the B-spline basis functions and Q_x, Q_y are the X-Y coordinates of the control points of the B-spline curve, so that

r(s) = (x(s), y(s))^T = (I_2 ⊗ B(s)^T) Q,  Q = (Q_x^T, Q_y^T)^T,  (5.2)

where I_2 denotes a 2x2 identity matrix, ⊗ is the Kronecker product of two matrices, and 0′ is (0,0,0,0,0,0,0,0)^T (an 8-vector of zeros used in the shape matrix below). In this design, a control point vector Q containing 8 control points is used to represent the vehicle contour, as shown in Fig. 5-2. The control points are indicated by circles in the figure.

Fig. 5-1. Vehicle tracking system architecture.


Fig. 5-2. Active contour of a vehicle.

5.2.2 Shape Space Transformation

In traffic image sequences, the contour of a moving vehicle varies due to the projective effects of the camera's view angle. Within a reasonable view angle, it can be assumed that the variation of vehicle contours is linear in traffic imagery. The vehicle contour can then be described by a shape-space planar affine transformation in the image plane. The boundary curve r(s) of each vehicle is expressed using a template curve r_0(s) [75]:

r(s) = M r_0(s) + u,  (5.4)

where u = (u_x, u_y)^T is a translation vector and M is a 2x2 affine matrix comprising one rotation and three deformation (horizontal, vertical, and diagonal) elements. Subtracting r_0(s) from (5.4), one obtains:

r(s) - r_0(s) = (M - I_2) r_0(s) + u.  (5.5)

Expressed in terms of the control points, this variation is spanned by a shape matrix W:

Q - Q_0 = W X,  W = [ 1′ 0′ Q_0x 0′ 0′ Q_0y ; 0′ 1′ 0′ Q_0y Q_0x 0′ ],  (5.6)

where 1′ is (1,1,1,1,1,1,1,1)^T, Q_0x, Q_0y are the X-Y coordinates of the control points of the template curve Q_0, and the shape-space vector is

X = ( u_x  u_y  M_11 - 1  M_22 - 1  M_21  M_12 )^T.

Comparing (5.5) and (5.6), one obtains a linear transformation:

Q = Q_0 + W X.  (5.7)

Using (5.7), one can transform a vehicle contour to a shape-space vector X. This simplifies the post-processing of contour tracking in the image plane, and the shape-space vector restricts the vehicle contour to vary steadily.
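A small sketch of the shape-space mapping of (5.6)-(5.7) follows, assuming 8 control points per contour as above; it builds W from the template control points and reconstructs a contour from a shape-space vector X:

```python
import numpy as np

def shape_matrix(Q0x, Q0y):
    """Build the 16 x 6 planar-affine shape matrix W from the template
    control point coordinates (8 points => 8-vectors Q0x, Q0y)."""
    one, zero = np.ones(8), np.zeros(8)
    top = np.stack([one, zero, Q0x, zero, zero, Q0y], axis=1)     # x rows
    bottom = np.stack([zero, one, zero, Q0y, Q0x, zero], axis=1)  # y rows
    return np.concatenate([top, bottom])

def contour_from_shape(Q0, W, X):
    """Q = Q0 + W X, per (5.7); Q0 and the result stack x before y."""
    return Q0 + W @ X

# Example: pure translation by (5, -2) leaves the shape itself unchanged.
Q0x, Q0y = np.arange(8.0), np.arange(8.0)[::-1].copy()
Q0 = np.concatenate([Q0x, Q0y])
W = shape_matrix(Q0x, Q0y)
Q = contour_from_shape(Q0, W, np.array([5.0, -2.0, 0, 0, 0, 0]))
```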

5.2.3 Image Measurement

An image measurement procedure is responsible for obtaining the best-fit curve of the vehicle outline in an image according to a predicted vehicle contour generated from a predicted shape-space vector X̃, a template curve Q_0, and its shape matrix W. The binary image of a traveling vehicle is segmented from the traffic imagery by a background removal operation (see Chapter 3). The contour feature r_f(s) is obtained by applying one-dimensional (1-D) image processing along the normal direction of the predicted curve r̃(s) [76]. A curve-fitting method over the detected features is employed to obtain the best-fit curve of the vehicle contour. In carrying out the curve fitting of contour features, one has to increase the tolerance to image disturbance and eliminate possible interference from features of other objects in the background. The contour shape-space vector X̃ and a regularization constant α are used to control the relative effect of the shape prior in the curve fitting and


meet the criteria mentioned above.

Introducing the concept of the information matrix S_i and the information weight sum Z_i, the algorithm for finding the best-fit curve can be summarized as follows:

1) Select N regularly equal-spaced samples s_i, i = 1, 2, ..., N, with s_1 = h and s_{i+1} = s_i + h.

2) At each sample s_i, search along the normal direction of the predicted curve r̃(s_i) for the contour feature r_f(s_i) and form the corresponding measurement.

3) Accumulate the statistical information S_i and the information weight sum Z_i recursively from the measurements, starting from S_0 = 0 and Z_0 = 0.

4) The aggregated observation vector is Z = Z_N, with the associated statistical information S = S_N.

5) The best-fit curve is expressed as a shape-space vector [76]

X = X̃ + (S̄ + S)^{-1} Z,  (5.12)


where S̄ = (N/α) W^T ( ∫_0^1 (I_2 ⊗ B(s)^T)^T (I_2 ⊗ B(s)^T) ds ) W.
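A compact sketch of this information-fusion fit is given below; the normal-direction feature search is abstracted into its outputs, and the measurement model follows the standard active-contour fitting recursion (per-sample information accumulation, then the regularized solve of (5.12)). The names and the noise scale sigma are illustrative:

```python
import numpy as np

def fit_shape(X_pred, h_vectors, innovations, S_bar, sigma=1.0):
    """Aggregate per-sample information and solve the regularized fit:
    X = X_pred + (S_bar + S)^(-1) Z, cf. (5.12).

    h_vectors   : list of measurement vectors h_i (shape-space Jacobians
                  of the contour at each sample point)
    innovations : list of scalar displacements nu_i measured along the
                  contour normals (feature position minus predicted curve)
    S_bar       : prior (regularization) information matrix
    """
    S = np.zeros_like(S_bar)
    Z = np.zeros(S_bar.shape[0])
    for h_i, nu_i in zip(h_vectors, innovations):  # steps 2)-4): accumulate
        S += np.outer(h_i, h_i) / sigma**2
        Z += h_i * nu_i / sigma**2
    return X_pred + np.linalg.solve(S_bar + S, Z)  # step 5): best-fit X
```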

5.3 Image Tracking and Traffic Parameter Estimation

In our design, the image-based traffic monitoring procedure includes four steps: foreground segmentation, contour initialization, vehicle tracking, and traffic parameter extraction. The first step segments moving vehicles from the image sequence. The contour initialization step detects the moving vehicle and generates an initial contour for tracking. In the vehicle tracking step, targeted vehicles are tracked using a specially designed Kalman filter. The final step extracts the traffic parameters by using the tracking results.

5.3.1 Initial Contour Generation

To track multiple vehicles that are of various sizes and that might travel in any portion of a multi-lane road, we propose a contour initialization algorithm that generates initial contours for image tracking by using a specially designed detection window. The concept of the detection window is shown in Fig. 5-3. Depending on the current vehicle imagery, there can be multiple detection regions and initialization regions in the detection window, as shown in Fig. 5-3. In the beginning, the entire detection window is the detection region. The system works to check whether any vehicle enters the detection region. When a vehicle is detected, the related detection region changes into an initialization region. The rest of the detection region remains unchanged. If the detected vehicle leaves the initialization region, this region is released and becomes a detection region again. The detection region and the initialization region are automatically adjusted according to the current traffic imagery.


Fig. 5-3. A detection window consists of initialization regions and detection regions.

5.3.1.1 Image Processing in Detection Region and Initialization Region

To facilitate vehicle detection, the detection region is divided into several 1-pixel-width sub-regions. A sub-region is termed an occupied sub-region if a foreground object exists in it. When the front part of a vehicle enters the detection window, a cluster of occupied sub-regions appears, and the width of the vehicle image can be found from the front part of the vehicle image. Once the related detection region contains enough occupied sub-regions, it changes into an initialization region, as mentioned in the previous paragraph. Fig. 5-4 shows a test example with a car and a motorcycle in the detection window. Both the car and the motorcycle are detected; two initialization regions are automatically generated in this case.
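The following sketch illustrates this occupancy test on a binary foreground image; the detection-window placement and the minimum cluster width are illustrative assumptions:

```python
import numpy as np

def detect_entries(fg, row, width_min=8):
    """Scan one image row of the detection window: each column is a
    1-pixel-width sub-region, occupied if foreground is present there.
    Return (start, end) column spans of occupied clusters wide enough
    to become initialization regions."""
    occupied = fg[row, :] > 0                      # per-sub-region occupancy
    spans, start = [], None
    for x, occ in enumerate(np.append(occupied, False)):
        if occ and start is None:
            start = x                              # cluster begins
        elif not occ and start is not None:
            if x - start >= width_min:             # wide enough => vehicle front
                spans.append((start, x))
            start = None
    return spans
```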

5.3.1.2 Initial Contour Generation

The final stage of contour initialization is the generation of an initial contour as the vehicle leaves the detection window. Fig. 5-5 depicts the concept of initial contour generation.

First, an estimated contour is automatically generated by using the geometric information of the initialization region. The width of the estimated contour is the width w of the initialization region. The length L_b of the estimated contour is assigned to be ℜw, where ℜ is an empirical ratio


of the length to the width of a vehicle in the captured image frame. The location of the estimated contour is assigned at the exit of the initialization region, as shown in Fig. 5-5. Clearly, the estimated contour is generated via simple geometric relationships in the initialization region; it may not match the actual situation of the vehicle image. Next, the estimated contour is refined by using image measurement, as described below.

The dimension and location of the estimated contour are adjusted by using the actual vehicle image. The length L_b and the location of the estimated contour are possibly different from the actual ones and need to be corrected. The width w was previously estimated by the detection window when the vehicle entered it, so it is not necessary to adjust the width of the estimated contour. We project the captured binary image into two one-dimensional arrays and use the projections to measure the occupancy of the vehicle image. The projection values are the sums of vehicle pixels along the vertical and horizontal directions, respectively.

Fig. 5-4. The detection of a car and a motorcycle.


Fig. 5-5. Generation of an initial contour.

Fig. 5-6 illustrates an example. In this case, the vertical projection reveals that the corner point P_t of the initial contour should shift to the point P_3, while the horizontal projection reveals that the length L_b should be adjusted to L_a, as shown in Fig. 5-5. The control points and the center point of the initial contour are then generated using w, L_a, P_3, and the initialization region. They are employed to calculate the template Q_0 and the shape-space vector X for image tracking.
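A sketch of this projection-based refinement is shown below; it assumes a binary region of interest around the estimated contour and a simple threshold on the projection profiles (the threshold fraction is an illustrative choice):

```python
import numpy as np

def refine_contour_box(fg_roi, frac=0.2):
    """Refine an estimated contour box from the binary vehicle ROI:
    project foreground pixels onto the two axes and keep the span where
    each profile exceeds a fraction of its peak."""
    col_profile = fg_roi.sum(axis=0)  # vertical projection (per column)
    row_profile = fg_roi.sum(axis=1)  # horizontal projection (per row)

    def span(profile):
        idx = np.flatnonzero(profile >= frac * profile.max())
        return idx[0], idx[-1] + 1    # first/last sufficiently occupied line

    x0, x1 = span(col_profile)        # corrected horizontal placement
    y0, y1 = span(row_profile)        # corrected vehicle length (L_a)
    return x0, y0, x1, y1
```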


Fig. 5-6. The projection profile of an estimated contour.

5.3.2 Kalman Filtering for Tracking

The vehicle contour is represented by a shape-space vector X with six elements. The first two elements of X are the position coordinates of the template curve, and the remaining elements are shape-scaling elements, as described in Section 5.2.2. The vehicle tracking module employs two dynamic models to predict the horizontal and vertical positions from their historical position states. As for the shape-scaling elements, because the change of the vehicle contour is very small between two consecutive frames, it is not necessary to employ complex dynamic models to predict them. The predicted states of these elements are simply assigned the previously measured state X obtained from the image measurement, as described in Section 5.2.3. The predicted states are provided to the information fusion stage for tracking a vehicle.


Below is the design of the dynamic position model for contour prediction. The state of the horizontal or vertical position is governed by

x_{k+1} = A x_k + w_k,

where w_k is zero-mean process noise. The observation model is

v_k = C x_k + n_k,

where n_k is zero-mean measurement noise; together these form a state-space description of a linear stochastic system. A Kalman filter is designed to combine the information from the predicted states and the best-fit states obtained from (5.12) [77]. Fig. 5-7 shows the block diagram of the image tracking system. The tracking procedure over one time-step is summarized as follows:

1) Predict the state ahead: X̂_{k,k-1} = A X̂_{k-1,k-1}.

2) Determine the error covariance ahead: P_{k,k-1} = A P_{k-1,k-1} A^T + Q_k.

3) Compute the Kalman gain: K = P_{k,k-1} C^T (C P_{k,k-1} C^T + R_k)^{-1}.

4) Use the Kalman filter to obtain the information fusion:

X̂_{k,k} = X̂_{k,k-1} + K (v_k - C X̂_{k,k-1}),  (5.19)

where v_k is the measured state.

5) Update the error covariance:

P_{k,k} = (I - K C) P_{k,k-1};  (5.20)

then go to step 1 for the next iteration.
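A minimal sketch of this predict-fuse-update loop follows, using a constant-velocity position model for illustration (the A, C, Q, R values are assumptions, not the thesis's tuned parameters):

```python
import numpy as np

class PositionKalman:
    """1-D constant-velocity Kalman filter for one contour position state."""
    def __init__(self, q=1e-2, r=1.0):
        self.A = np.array([[1.0, 1.0], [0.0, 1.0]])  # position + velocity model
        self.C = np.array([[1.0, 0.0]])              # only position is observed
        self.Q = q * np.eye(2)                       # process noise covariance
        self.R = np.array([[r]])                     # measurement noise covariance
        self.x = np.zeros(2)
        self.P = np.eye(2)

    def step(self, v_k):
        # 1)-2) predict the state and error covariance ahead
        x_pred = self.A @ self.x
        P_pred = self.A @ self.P @ self.A.T + self.Q
        # 3) compute the Kalman gain
        S = self.C @ P_pred @ self.C.T + self.R
        K = P_pred @ self.C.T @ np.linalg.inv(S)
        # 4) fuse the prediction with the measured state, cf. (5.19)
        self.x = x_pred + (K @ (v_k - self.C @ x_pred)).ravel()
        # 5) update the error covariance, cf. (5.20)
        self.P = (np.eye(2) - K @ self.C) @ P_pred
        return self.x[0]  # filtered position
```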

5.3.3 Traffic Parameter Estimation

As vehicles in an image sequence are successfully tracked, traffic parameters such as traffic flow rate, vehicle speeds, and traffic density can be obtained through simple calculation.

The traffic flow rate can be calculated as the ratio of the number of detected vehicles to the elapsed time. Traffic density D (car/km) is calculated as follows:

D = q / s,  (5.21)

where q is the flow rate (car/hr) and s is the average travel speed (km/hr).

Vehicle speed can be obtained from two recorded positions of the vehicle image and the elapsed time between these two positions. The center of the bottom edge of a vehicle image is taken as the reference point of the vehicle's position. This point can be easily obtained from the control vectors of the vehicle contour.

From the tracking result of a tracked vehicle, the reference point P_a (where the tracking operation is initialized) at time t_a and the reference point P_b (where the vehicle reaches a predefined region and the tracking operation terminates) at time t_b are recorded. Using (B.6) and (B.7), the positions P_a and P_b are transformed to world coordinates to calculate the traveling distance L between P_a and P_b. The vehicle speed vel can then be calculated by


Fig. 5-7. Block diagram of the vehicle tracking system.

vel = L / (t_b - t_a).  (5.22)
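The sketch below turns these relations into code; the world-coordinate transform of (B.6)-(B.7) is abstracted as a supplied function, since its details lie outside this section:

```python
def traffic_parameters(n_vehicles, elapsed_hr, Pa, Pb, ta, tb, to_world):
    """Compute flow rate q (car/hr), speed vel (km/hr) per (5.22), and
    density D (car/km) per (5.21). `to_world` maps an image point to
    world coordinates in km, standing in for (B.6)-(B.7)."""
    q = n_vehicles / elapsed_hr                    # flow rate
    xa, ya = to_world(Pa)
    xb, yb = to_world(Pb)
    L = ((xb - xa)**2 + (yb - ya)**2) ** 0.5       # traveling distance (km)
    vel = L / (tb - ta)                            # (5.22), with time in hours
    D = q / vel                                    # (5.21)
    return q, vel, D
```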

5.4 Turn Ratio Measurement

For estimating the vehicle turn ratio, two detection windows are set up to detect oncoming vehicles turning left and moving straight, respectively. The driving directions of vehicles that travel in the left-turn detection window, for instance, fall into three different types: moving right, moving straight ahead, and moving left, as depicted in Fig. 5-8. A rectangular area that covers the left-turn detection window is placed in the lower right region, as shown in the figure. An oncoming vehicle that is turning left travels to the right in the rectangular area of Fig. 5-8(a). Two vehicles moving straight pass through the rectangular area of Fig. 5-8(b); their driving direction is straight ahead. In Fig. 5-8(c), two vehicles traveling straight ahead in the cross-lane direction travel to the left in the rectangular area. Only the oncoming vehicles turning left must be monitored in this case. As noted above, the motion vector estimation module is designed to measure and classify the driving directions of vehicles in the detection window based on an optical flow measurement. The method of motion vector estimation will be described in Section 5.4.1.

The counting of vehicles by the image processing procedures described in the previous section facilitates the calculation of the turn ratio. In this design, the turn ratio at an intersection is measured automatically. For instance, the left-turn ratio is defined as

turn ratio = N_left / N_total,  (5.23)

where N_total is the total count of detected vehicles in a time period and N_left is the count of detected vehicles turning left in the same period.
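As a sketch of how the driving direction could be classified inside the rectangular area, the snippet below uses a mean dense optical flow vector; the thesis defers the actual motion vector estimation to Section 5.4.1, so this is only an assumed variant built on OpenCV's Farneback method:

```python
import cv2
import numpy as np

def driving_direction(prev_gray, curr_gray, roi):
    """Classify the dominant motion in a rectangular area as 'left',
    'right', or 'straight' from the mean dense optical flow vector."""
    x0, y0, x1, y1 = roi
    flow = cv2.calcOpticalFlowFarneback(prev_gray[y0:y1, x0:x1],
                                        curr_gray[y0:y1, x0:x1],
                                        None, 0.5, 3, 15, 3, 5, 1.2, 0)
    dx, dy = flow[..., 0].mean(), flow[..., 1].mean()
    if abs(dx) <= abs(dy):
        return "straight"                  # dominant motion along the lane
    return "right" if dx > 0 else "left"   # dominant sideways motion

def left_turn_ratio(n_left, n_total):
    """Left-turn ratio per (5.23)."""
    return n_left / n_total
```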