

2.5 Summary

A novel algorithm has been proposed for automatic calibration of a PTZ camera overlooking a traffic scene. The proposed approach requires no manual operation to select positions of special features. It automatically uses a set of parallel lane markings and the lane width to compute the camera parameters, namely, focal length, tilt angle, and pan angle.

An image processing procedure has been developed to automatically locate the parallel lane markings. The focal length is estimated from these markings and the known lane width, and the pan and tilt angles of the camera are then obtained by using the estimated focal length. Synthetic data and actual traffic imagery have been employed to validate the accuracy and robustness of the proposed method.


Table 2-4 Calibration Results under Different Camera Pose Settings


Chapter 3

Background Generation and Foreground Segmentation

3.1 Introduction

An image is considered a background image if there are no moving objects in the image.

Hence, one can obtain a background image by merely capturing an image with no moving objects. However, on a heavily trafficked road, it is difficult to capture a traffic scene with no moving vehicles. Researchers have therefore devoted themselves to developing methods for extracting the background image from traffic image sequences. Most methods exploit the concept of probability for background segmentation; examples are the GMM-based method and the histogram method. However, several directions deserve further study for improving the quality of background segmentation. For instance, the GMM-based method is sensitive to slow-moving objects, which may be misrecognized as background objects. In contrast, the histogram method is sensitive to sensing noise, which degrades the quality of background segmentation. As an alternative to these two methods, a background estimation method using a single Gaussian scheme is presented in this chapter. A group-based histogram (GBH) algorithm is proposed to build the Gaussian background model of each pixel from traffic image sequences. The method is efficient and robust to both slow-moving vehicles and sensor noise. Accordingly, moving vehicles can be segmented by background removal.

The rest of this chapter is organized as follows. Section 3.2 describes the GBH algorithm for estimating the single Gaussian background model. Experimental results of the proposed method and several interesting examples of traffic parameter estimation are presented in Section 3.3. Section 3.4 summarizes the contributions of this work.

3.2 Group-Based Histogram

In traffic imagery, the intensity of the background scene is the most frequently recorded intensity at its pixel position. The background intensity can therefore be determined by analyzing the intensity histogram. However, sensing variation and noise from image acquisition devices may result in erroneous estimation and cause a foreground object to have the maximum intensity frequency in the histogram. To solve this problem, we propose the GBH to estimate the single Gaussian model of the static background. In the first stage, the group-based frequency of each intensity level is generated from the incoming intensities of an image sequence. The intensity level that has the maximum group-based frequency is treated as the mean of the single Gaussian background model. The standard deviation of the Gaussian model can be computed by using the estimated mean and the histogram. The detailed procedure is explained in the following paragraphs.

3.2.1 Single Gaussian Background Modeling and GBH

In the GBH, a cumulative frequency is generated from the frequency of each individual intensity level as well as those of its neighboring levels in the histogram. In other words, the frequency of each intensity level and the frequencies of its neighboring levels are summed to form the group-based histogram. First note that the frequency or probability of a conventional histogram is updated by using a single intensity, while the probability of the GMM is constructed from a group of intensities. Thus, the GMM is better suited than a simple histogram for representing the intensity distribution of the background image. The GBH possesses similar merits to the GMM because it takes the variation of pixel intensity into account. This operation effectively handles the problem of sensor noise. Further, the GBH algorithm gives a reliable Gaussian background model by using only several image frames.

The GBH can be generated by applying an average filter of width 2w+1 to the histogram, where w is the half-width of the window measured in intensity levels. In the GBH, the maximum probability density p*_{u,v} of a pixel at location (u, v) over the N recorded image frames can be expressed as

p*_{u,v} = max_i (1/((2w+1)N)) Σ_{j=i-w}^{i+w} h_{u,v}(j),

where h_{u,v}(j) denotes the histogram frequency of intensity level j at location (u, v). Because the average filter smooths the histogram curve, the maximum peak of the GBH is located closer to the center of the Gaussian background model than the peak of the original histogram. Thus, the intensity that has the maximum frequency in the GBH can be treated as the mean intensity µ_{u,v} of the background model:

µ_{u,v} = arg max_i Σ_{j=i-w}^{i+w} h_{u,v}(j).
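As an illustration of this smoothing-and-arg-max step, the sketch below builds a GBH from a per-pixel intensity histogram with NumPy; the function name, the synthetic data, and the NumPy formulation are illustrative choices, not the implementation used in this work.

```python
import numpy as np

def gbh_mean(hist, w=3):
    """Estimate the background mean intensity from a 256-bin histogram (illustrative sketch).

    hist : 1-D array, hist[i] = frequency of intensity i at one pixel position.
    w    : half-width of the averaging window, i.e. the filter width is 2w + 1.
    """
    kernel = np.ones(2 * w + 1) / (2 * w + 1)        # average filter of width 2w + 1
    gbh = np.convolve(hist, kernel, mode="same")     # group-based histogram C(i)
    return int(np.argmax(gbh)), gbh                  # intensity with the maximum GBH frequency

# Example: a noisy background Gaussian around intensity 205 plus a few foreground samples
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(205, 4, 180), rng.normal(110, 5, 16)]).astype(int)
hist = np.bincount(np.clip(samples, 0, 255), minlength=256)
mu, _ = gbh_mean(hist, w=3)
print("estimated background mean:", mu)              # expected to be close to 205
```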

A smaller window width saves computation time when building the GBH, while a larger width produces a smoother GBH. To examine the choice of window width, an example is given below that employs 13 Gaussian data sets generated by a Gaussian random number generator. The means of all Gaussians were set to 205, and the standard deviations varied from 3 to 15. The histogram and the GBH generated with different widths were used to estimate the mean of each Gaussian; the resulting error rates are shown in Table 3-1, where error rates that fall within ±2% are highlighted. The results show that the estimation of the proposed method is superior to that of a conventional histogram.

One can conclude from the simulation results that a larger window width of an average filter will be needed for high-accuracy performance as the standard deviation increases.

Keeping the error rate of the mean estimation within ±2%, the required window width can be determined from the simulation results according to the expected standard deviation.

In the following derivation of the Gaussian model, C(i) is used to record the GBH frequency of intensity level i. The mean intensity µ_{u,v} can be obtained from the maximum-value counter. The system does not need to process all counters when a new intensity l is captured, because the new intensity only increases the counters in the neighborhood of counter C(l).

Table 3-1 Estimation Error Rate of the Gaussian Mean Using the Histogram and the GBH

Standard deviation        3      4      5      6      7      8      9     10     11     12     13     14     15
Histogram              -1.5%  -1.5%  -2.0%  -2.4%  -2.4%  -2.9%  -3.4%  -2.9%  -2.9%  -4.4%  -2.9%  -4.9%  -4.4%
Proposed GBH (w = 1)    0.5%   1.0%   0.0%  -0.5%  -0.5%   2.0%  -3.4%  -2.4%  -2.9%  -1.5%  -3.4%   0.5%   1.0%


The proposed algorithm for obtaining the mean of the Gaussian model is summarized below.

step 1. Record the current intensity l of a pixel.
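A minimal sketch of the per-pixel counter update implied by the preceding description, assuming one counter array per pixel; the class structure, method names, and tie-breaking behavior are illustrative and not the thesis implementation.

```python
import numpy as np

class PixelGBH:
    """Per-pixel group-based histogram with incremental counter updates (illustrative sketch)."""

    def __init__(self, w=3, levels=256):
        self.w = w
        self.hist = np.zeros(levels, dtype=np.int32)   # conventional histogram h(i)
        self.gbh = np.zeros(levels, dtype=np.int32)    # group-based counters C(i)
        self.mean = 0                                  # intensity of the maximum-value counter

    def update(self, l):
        """Record intensity l (step 1) and refresh only the counters around it."""
        self.hist[l] += 1
        # only counters C(l - w), ..., C(l + w) are affected by the new sample
        lo, hi = max(0, l - self.w), min(len(self.gbh) - 1, l + self.w)
        self.gbh[lo:hi + 1] += 1
        # keep the intensity of the maximum-value counter as the current mean estimate
        if self.gbh[lo:hi + 1].max() > self.gbh[self.mean]:
            self.mean = lo + int(np.argmax(self.gbh[lo:hi + 1]))
        return self.mean
```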

After the center of the background model is found, the variance can be computed from the histogram by using the estimated mean, where σ′ denotes the maximum standard deviation of the Gaussian background models (a value that can be obtained experimentally by analyzing the background models from image sequences offline).
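One possible realization of this computation is sketched below, under the assumption that only histogram samples within ±3σ′ of the estimated mean are treated as background; the 3σ′ bound and the function names are assumptions made for illustration.

```python
import numpy as np

def background_std(hist, mu, sigma_max):
    """Estimate the background standard deviation at one pixel (illustrative sketch).

    hist      : 1-D array, hist[i] = frequency of intensity i.
    mu        : background mean estimated from the GBH.
    sigma_max : sigma-prime, the maximum background standard deviation obtained offline.
    """
    levels = np.arange(len(hist))
    # assumption: only samples within 3 * sigma-prime of the mean are background samples
    mask = np.abs(levels - mu) <= 3 * sigma_max
    weights = hist[mask].astype(float)
    if weights.sum() == 0:
        return float(sigma_max)
    variance = np.average((levels[mask] - mu) ** 2, weights=weights)
    return float(np.sqrt(variance))
```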

Fig. 3-1(a) shows an example of the intensity histogram of a pixel in a traffic image sequence. It is clear from the figure that the background intensity is distributed in the range from 195 to 215. Further analysis of the sampled data in the background-intensity region was performed by using MINITAB Statistical Software (release 13.32 for Windows; Minitab Inc., State College, PA). The result shows that the data can be modeled as a Gaussian [67] with a mean of 203.65 and a standard deviation of 3.88.


Fig. 3-1. Statistical analysis of pixel intensity. (a) Histogram. (b) Group-based histogram of Fig. 3-1(a).

However, one cannot determine the center of the background model from the histogram, because three intensities share the same maximum frequency. In contrast, the background intensity can easily be found in the group-based histogram (with w = 3), as shown in Fig. 3-1(b). The mean and the standard deviation of the Gaussian model estimated from Fig. 3-1(b) are 205 and 4, respectively, corresponding to error rates of 0.67% and 3.17%. These results confirm that the derived probability density function fits the background intensities satisfactorily.

Note that the proposed GBH uses only addition and comparison operations to estimate the mean of the background pixels; in contrast, methods such as the GMM involve more complex operations, including multiplication, sorting, and division. Because estimating the mean of the background model is less time-consuming than estimating the standard deviation, the mean intensity, the GBH, and the histogram are updated at every sample interval, whereas the standard deviation is updated every 30 (or more) frames. To reduce the computational burden, the estimation of the standard deviation is spread step by step over the sample intervals. In traffic monitoring applications, the GBH, the histogram, and the background model are renewed every 15 minutes to cope with illumination variations. The computational load of the proposed method is significantly lower than that of the methods presented in [24] and [29]; thus, it is more suitable for the real-time requirements of visual tracking of vehicles.
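The update schedule described above might be organized as in the following sketch; the frame count and the 15-minute renewal period follow the text, while the module interface is assumed for illustration.

```python
FRAMES_PER_STD_UPDATE = 30        # standard deviation refreshed every 30 (or more) frames
RENEW_PERIOD_SEC = 15 * 60        # GBH, histogram and model rebuilt every 15 minutes

def process_frame(frame_idx, elapsed_sec, model):
    """Illustrative scheduling of the background-model updates (module interface assumed)."""
    model.update_mean_and_histograms()     # mean, GBH and histogram: every sample interval
    model.step_std_estimation()            # part of the std computation, spread over frames
    if frame_idx % FRAMES_PER_STD_UPDATE == 0:
        model.commit_std()                 # standard deviation refreshed every 30 frames
    if elapsed_sec >= RENEW_PERIOD_SEC:
        model.renew()                      # full rebuild to cope with illumination changes
```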

3.2.2 Foreground Segmentation

In this section, we present the method for detecting moving foreground objects based on the background estimation results. The intensities of each pixel obtained from an image sequence form a background Gaussian as well as several foreground distributions, as shown in Fig. 3-1(a). The historical intensities of each pixel can thus be divided into two groups: the static background and the moving foreground. When a foreground object appears at a pixel, its intensity falls outside the background Gaussian. To cope with the variation of the background intensity, a pixel is classified as foreground if its current intensity falls outside a fixed multiple of σ_{u,v} around the mean µ_{u,v} of the Gaussian background model at location (u, v). An erosion operation is further employed to remove salt-and-pepper noise [64]. A traffic surveillance system then performs image measurements on the resulting binary image to monitor moving objects.
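A possible realization of this foreground test is sketched below, assuming a threshold of k standard deviations around the per-pixel mean and a simple 3 x 3 erosion; the value of k and the SciPy-based erosion are illustrative choices, not the thesis implementation.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def segment_foreground(gray, mean, std, k=3.0):
    """Mark pixels whose intensity falls outside the background Gaussian (illustrative sketch).

    gray, mean, std : 2-D arrays holding the current frame and the per-pixel background model.
    k               : number of standard deviations tolerated as background (assumed value).
    """
    foreground = np.abs(gray.astype(float) - mean) > k * std
    # erosion removes isolated salt-and-pepper responses from the binary image
    return binary_erosion(foreground, structure=np.ones((3, 3), dtype=bool))
```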

3.3 Experimental Results

Several experiments have been carried out to validate the performance of the proposed method. A pixel-level experiment is presented in Section 3.3.1 to demonstrate the effectiveness and robustness of the GBH approach. In Section 3.3.2, background images generated by using GMM and GBH are presented for performance comparison. Section 3.3.3 presents interesting experimental results of traffic parameter estimation.

3.3.1 Pixel Level Experiments

Fig. 3-2 shows the experimental results of background estimation for a pixel in a traffic image sequence; the position of the sample pixel is marked by a cross in Fig. 3-2(b). The intensity was recorded over a fixed span of time (196 frames in 6.5 seconds). For comparison, a GMM with three Gaussian models was first constructed to model the pixel [29]. If a Gaussian matches the current pixel intensity x_t at time t, the mean c_t of that Gaussian is updated by

c_t = (1 - β) c_{t-1} + β x_t,

where β is the learning rate. All weights are updated by

ω_{k,t} = (1 - β) ω_{k,t-1} + β M_{k,t},

where M_{k,t} equals 1 for the matched Gaussian and 0 otherwise, so that each weight corresponds to the probability of its intensity cluster based on past values. The normalized clusters are sorted in order of decreasing weight. In this experiment, the Gaussian model that has the maximum weight is classified as the background Gaussian. The experimental result is shown in Fig. 3-2(a). In the figure, the solid line indicates the experimental data and the dashed line represents the estimation result of the GMM. In the experiment, the learning rate was selected as 0.1. One can see that the estimated mean matches the background intensity well.
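For reference, the matched-Gaussian update used in this comparison can be sketched as follows, following the standard formulation of [29]; the variable names, the 2.5σ matching rule, and the re-initialization values are illustrative assumptions.

```python
import numpy as np

def gmm_update(x, means, variances, weights, beta=0.1, match_sigmas=2.5):
    """One per-pixel update step of a Gaussian mixture background model (illustrative sketch).

    x : current pixel intensity; means, variances, weights : arrays describing the K Gaussians.
    Returns the index of the Gaussian with the maximum weight (the background Gaussian).
    """
    matched = np.abs(x - means) <= match_sigmas * np.sqrt(variances)
    if matched.any():
        k = int(np.argmax(matched))                     # first matching Gaussian
        means[k] = (1.0 - beta) * means[k] + beta * x   # c_t = (1 - beta) c_{t-1} + beta x_t
        variances[k] = (1.0 - beta) * variances[k] + beta * (x - means[k]) ** 2
    else:
        k = int(np.argmin(weights))                     # replace the least probable Gaussian
        means[k], variances[k] = float(x), 15.0 ** 2    # re-initialization values assumed
    weights *= (1.0 - beta)                             # weights approximate cluster probabilities
    weights[k] += beta
    return int(np.argmax(weights))
```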

However, transient appearances of slow-moving objects easily change the weights and their ordering in the GMM background estimation. For example, from frame 150 to frame 162, a transient slow-moving foreground object quickly results in a matched Gaussian model, and intensity 110 is erroneously chosen as the background intensity because of its maximum weight. This type of erroneous estimation arises because, in the GMM, a foreground object that moves slowly or stops for a short period of time causes its intensity cluster to obtain the maximum weight, even though its intensities do not have the maximum probability in the historical data.


Fig. 3-2. (a) Background estimation of the sample pixel using the GMM-based and the GBH-based approaches. (b) Traffic scene with a cross at the middle left showing the position of the sample pixel.



The proposed GBH method was then applied to the same data to find the background intensity of the pixel. The estimation result using the GBH method is also plotted in Fig. 3-2(a); the gray level of the background pixel was maintained near 205, as expected. The GBH-based method thus estimates an accurate mean of the background model from the image sequence and is robust to the transient appearances of slow-moving objects.

3.3.2 Background Estimation of Traffic Imagery

To examine the performance of the proposed method, we utilized twelve image frames of the traffic scene to construct the background image. In the experiments, we first convert the RGB color information in the image frames to the Y luminance [68]:

Y = 0.299R + 0.587G + 0.114B.    (3.9)
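For reference, (3.9) amounts to the following per-pixel conversion (a direct transcription using NumPy; the function name is illustrative):

```python
import numpy as np

def rgb_to_luminance(rgb):
    """Convert an H x W x 3 image (channels ordered R, G, B) to the Y luminance of (3.9)."""
    r, g, b = rgb[..., 0].astype(float), rgb[..., 1].astype(float), rgb[..., 2].astype(float)
    return 0.299 * r + 0.587 * g + 0.114 * b
```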

Fig. 3-3 shows the original image sequence. The frames are arranged in four rows, ordered from left to right within each row; the image in the upper-left corner is the first of the sequence. For comparison, both the GMM and the proposed GBH methods were employed to determine the background image of the traffic scene.

Fig. 3-4 depicts the background estimation results using the GMM. A reliable background image is obtained, as depicted in the figure. However, the transient appearance of slow-moving vehicles degrades the quality of the background image constructed by the GMM. For example, the squared area in the lower-right region is well constructed in the fifth image; however, the background image of that portion is unreliable in the later image frames, as depicted in the figure.


Fig. 3-3. Image sequence for background image generation.

This phenomenon is caused by a vehicle that slowly passed this region, as shown in Fig. 3-3. One can use a smaller learning rate to solve this problem; however, with a small learning rate, a reliable background image can only be obtained with much more computation.

Fig. 3-5 shows the estimation results using the proposed GBH method. The estimation quality of the GBH is practically the same as that of the GMM. As expected, the moving foreground vehicles almost disappear beginning with the fifth background image. Furthermore, the quality of the final background image is better than that from the GMM, especially in the lower-right region marked in Fig. 3-5.

The processing time of this experiment is listed in Table 3-2 for comparison. The computation time of the proposed GBH algorithm is considerably smaller than that of the GMM, with a reduction of about 35%. From these results, it is clear that the GBH is more suitable than the GMM for image-based traffic surveillance applications.


Fig. 3-4. Background images constructed by GMM method.

Fig. 3-5. Background images constructed by the proposed method.

Table 3-2 CPU Time of the Tested Background Estimation Algorithms

Algorithm      GBH method    Gaussian mixture model
Time (sec)     0.037         0.057

Specification: image size = 352 x 240, CPU = AMD Athlon XP 2000+, RAM = 512 MB.


3.3.3 Application to Traffic Flow Estimation

To investigate traffic congestion problems, the background segmentation module has been integrated into an ITMS for extracting traffic parameters. Our research team has established a real-time web video streaming system to monitor the traffic in Hsinchu Science Park [69]. The system provides an H.263 video stream to the ITMS. H.263 is suitable for digital video transmission over networks with its high compression and decompression ratio [70]. However, the images degraded by compression and decompression are neither stable nor consistent for image processing, which introduces extra challenges to the image processing design.

In this study, the video stream provided by the imaging system is employed to measure the traffic flow at a multi-lane entrance of the Science Park. The video stream, in CIF format (352 x 288 pixels), is transmitted to the computer in the lab at a rate of 7 frames per second through ADSL. The system architecture of the real-time traffic monitoring system is shown in Fig. 3-6.

The imaging system consists of three parts: an image processing module, a vehicle detection module, and a traffic parameter extraction module. The image processing module adopts an ActiveX component to decompress the image from the video stream and converts it into gray-level format. Real-life images are employed to construct a background image.

Fig. 3-6. Block diagram of the image-based traffic parameter extraction system.


Based on the constructed background image, a binary image of the moving vehicles is obtained through foreground segmentation. The detection module employs a detection window that behaves like a loop detector to count the number of vehicles on a multi-lane road [71]. The detection window checks whether a vehicle is entering or leaving the window in the binary image, and it can detect multiple vehicles simultaneously. The traffic parameter extraction module calculates the traffic flow data and provides the information to the ITS center.
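A simplified sketch of such a detection window operating on the binary foreground image is given below; the occupancy threshold and the window geometry are illustrative assumptions, not the thesis implementation.

```python
class DetectionWindow:
    """Virtual loop detector over a rectangular region of the binary image (illustrative sketch)."""

    def __init__(self, top, bottom, left, right, occupancy=0.3):
        self.region = (slice(top, bottom), slice(left, right))
        self.occupancy = occupancy      # fraction of foreground pixels treated as "occupied"
        self.was_occupied = False
        self.count = 0                  # number of vehicles that have passed through the window

    def update(self, binary):
        """Feed one binary foreground frame (NumPy array); count a vehicle when it leaves."""
        occupied = binary[self.region].mean() > self.occupancy
        if self.was_occupied and not occupied:
            self.count += 1
        self.was_occupied = occupied
        return self.count
```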

Fig. 3-7 illustrates a display of the system results. The upper-left part of the figure shows a real-time image and the lower-right part shows the background image created from the image sequence. The upper-right part depicts the binary image of moving vehicles as well as the processed results of the detection window. The detected vehicle count and the traffic flow data are displayed in the lower-left portion of the figure. The estimated traffic flow is calculated from the vehicle count using the equation below:

traffic flow = N_car / t_dur,    (3.10)

where N_car is the detected vehicle count and t_dur is the time duration.

Fig. 3-7. The display of traffic flow estimation.

A video clip of the experimental results can be found at: http://isci.cn.nctu.edu.tw/video/JCTai/Traffic_flow.wmv.

3.4 Summary

A group-based histogram algorithm has been proposed to build the Gaussian background model of each pixel from traffic image sequences. The algorithm features improved robustness against transient stops of foreground objects and against sensing noise. Furthermore, the method has a low computational load and thus meets the real-time requirements of many practical applications. The proposed method can be extended to construct a color background image to further increase the robustness of the intensity analysis.


Chapter 4

Cast-Shadow Detection in Traffic Image

4.1 Introduction

Cast-shadow suppression is an essential prerequisite for image-based traffic monitoring.

Shadows of moving objects often cause serious errors in image analysis due to misclassification between shadows and moving objects, so a shadow suppression method is needed to improve the accuracy of image analysis. Shadow detection methods can be divided into two types: shape-based approaches [33]-[34] and spectrum-based approaches [37]-[39]. In shape-based methods, sophisticated models are constructed to identify the shadow according to the object and its surrounding illumination conditions. The accuracy of shadow detection depends on knowledge of the environmental conditions. Moreover, such methods cannot meet real-time requirements because the computational load of the spatial analysis is heavy. In contrast, spectrum-based approaches exploit color-space information to find the shadow. The model is built by strenuously analyzing the shadows in many image frames, which makes its generation difficult and inefficient. Additionally, conversion between different color spaces requires much computation time and degrades real-time performance. Based on the Lambertian assumption, the RGB ratios between lit pixels and shadowed pixels can be treated as constant in an image sequence. This observation leads to the development of an RGB ratio model to detect shadow pixels in traffic imagery. The proposed approach does not require many image frames to construct the model. Instead, the model can be easily built from a shadow region in a single image frame. To increase the accuracy of shadow detection, two types of spatial analysis are proposed to verify the actual shadow pixels.
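Under the Lambertian assumption, this test reduces to comparing the per-channel ratios of a candidate pixel against the reference ratios measured from a sampled shadow region; the tolerance and function names in the sketch below are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def build_ratio_model(shadow_pixels, lit_pixels):
    """Estimate the per-channel shadow/lit RGB ratios from one sampled region (illustrative).

    shadow_pixels, lit_pixels : N x 3 arrays of RGB samples from the shadow region and the
    corresponding lit background.
    """
    return np.mean(shadow_pixels, axis=0) / np.mean(lit_pixels, axis=0)   # shape (3,): R, G, B

def is_shadow(pixel, background_pixel, ratio_model, tol=0.15):
    """Flag a foreground pixel as cast shadow if its attenuation matches the ratio model."""
    ratio = pixel.astype(float) / np.maximum(background_pixel.astype(float), 1.0)
    return bool(np.all(np.abs(ratio - ratio_model) < tol))
```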

The following sections describe how the ITMS with shadow suppression is developed. Section 4.2 focuses on shadow detection. A comparison with existing methods is presented in Section 4.3. Section 4.4 gives our concluding remarks.

4.2 Cast-Shadow Detection in Traffic Image

The block diagram of the proposed shadow suppression method is presented in Fig. 4-1.

The complete system consists of four modules: a background estimation module (Chapter 3), a background removal module (Chapter 3), a shadow detection module, and a shadow
