Background Model Construction Using HSV Model

Chapter 4 Experimental Result

4.1.1 Background Model Construction Using HSV Model

We built the background model in the HSV color space. Each of the H, S, and V values lies between 0 and 255. Figs. 4.3(a), 4.3(b), and 4.3(c) show the background image in the H, S, and V components, respectively. These three figures show that the hue value is relatively unstable when the saturation is close to zero. We ran an experiment to examine how the HSV components vary while the background model is constructed. Fig. 4.4 shows the H, S, and V variations of two pixels at coordinates (x, y) = (10, 10) and (x, y) = (120, 160) during the first 300 frames of the background video. From Fig. 4.4, we can see that the V component is the most stable in the background model, while the H and S components are less stable than V.


Fig. 4.3 Background images. (a) Background image in the H component, (b) Background image in the S component, and (c) Background image in the V component.


Fig. 4.4 H, S, and V variations versus frame index of background video image frame 1 to frame 300. (a) H at (10, 10), (b) H at (120, 160), (c) S at (10, 10), (d) S at (120, 160), (e) V at (10, 10), and (f) V at (120, 160).
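The instability of hue at low saturation can be reproduced with a small synthetic experiment. The sketch below is only an illustration under stated assumptions: it does not use the thesis video, the pixel values and the noise level are hypothetical, and H, S, and V are simply rescaled to 0..255 from Python's `colorsys` output. It simulates one pixel over 300 frames with small sensor noise and reports the standard deviation of each component.

```python
import colorsys
import random
import statistics

def hsv_255(r, g, b):
    """Convert 8-bit RGB to HSV with each component rescaled to 0..255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 255.0, s * 255.0, v * 255.0

def channel_spread(base_rgb, n_frames=300, noise=2, seed=0):
    """Std. dev. of H, S, V for one pixel observed over n_frames noisy frames.

    base_rgb, noise, and seed are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_frames):
        noisy = [min(255, max(0, c + rng.randint(-noise, noise))) for c in base_rgb]
        samples.append(hsv_255(*noisy))
    h, s, v = zip(*samples)
    return statistics.pstdev(h), statistics.pstdev(s), statistics.pstdev(v)

gray_spread = channel_spread((128, 128, 128))    # saturation close to zero
colored_spread = channel_spread((200, 60, 40))   # clearly saturated pixel
print("gray    H/S/V std:", gray_spread)
print("colored H/S/V std:", colored_spread)
```

For the near-gray pixel the hue jumps between unrelated values from frame to frame, while V changes only by the noise amplitude, which matches the qualitative behavior described for Figs. 4.3 and 4.4.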

4.1.2 Foreground Subjects Extraction Using HSV Model

In segmenting the images, the V component is usually stable and reliable, but it has a drawback: it is insensitive to similar colors, especially light ones such as yellow, pink, and light blue. When the subject’s clothing color differs from the background, background subtraction works well in the V component.

In the first step, we use the frame ratio in the V color component to obtain the binary image B(x, y) of Eq. (24), described in Sec. 3.1.4. The threshold k_V is chosen by experiments and varies with different trials. Hence, we ran a series of experiments to determine the optimal threshold k_V. Fig. 4.5 shows the binary image B(x, y) obtained with different values of k_V when the subject’s clothing color is different from the background. After the experiment, we set k_V = 1.5 in our system.

Fig. 4.5 An example of foreground extraction at different k_V thresholds. (a) An image frame with the subject’s clothing color different from the background, (b)−(f) foreground detected images: (b) k_V = 1.2, (c) k_V = 1.3, (d) k_V = 1.4, (e) k_V = 1.5, and (f) k_V = 1.6.
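The effect of the threshold can be sketched in a few lines. Eq. (24) is not reproduced in this chapter, so the code below assumes one plausible form of the frame ratio: a pixel is marked as foreground when the larger of the two V values divided by the smaller exceeds k_V. The function name, the `eps` guard, and the sample values are hypothetical.

```python
def frame_ratio_mask(frame_v, background_v, k_v=1.5, eps=1.0):
    """Binary mask B(x, y): 1 where the V ratio between frame and background exceeds k_v.

    Assumed form (not taken from Eq. (24)): ratio = max(I, B) / min(I, B).
    eps avoids division by zero for black pixels.
    """
    mask = []
    for row_f, row_b in zip(frame_v, background_v):
        out = []
        for f, b in zip(row_f, row_b):
            hi, lo = max(f, b) + eps, min(f, b) + eps
            out.append(1 if hi / lo > k_v else 0)
        mask.append(out)
    return mask

# Toy 2x3 V-channel images (hypothetical values).
background = [[100, 100, 100], [100, 100, 100]]
frame = [[105, 180, 40], [98, 100, 140]]

for k in (1.2, 1.3, 1.4, 1.5, 1.6):
    print(k, frame_ratio_mask(frame, background, k_v=k))
```

A smaller k_V marks moderately brighter pixels (here the value 140) as foreground, while a larger k_V keeps only strong deviations, which is the trade-off explored in Fig. 4.5.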

During foreground extraction, shadows introduce artifact foreground regions and degrade the recognition result. We use the shadow mask of Eq. (25), described in Sec. 3.1.4, which encodes the characteristics of shadows in the HSV domain, to classify each pixel as a shadow point or not. Fig. 4.6 shows the results of shadow suppression. Figs. 4.6(a) and 4.6(b) are two input images. Figs. 4.6(c) and 4.6(d) show the foreground subjects without shadow suppression. Figs. 4.6(e) and 4.6(f) show the foreground subjects with shadow suppression, a clear improvement over Figs. 4.6(c) and 4.6(d).

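Fig. 4.6 An example of shadow suppression. (a), (b) Input images; (c), (d) foreground subjects without shadow suppression; (e), (f) foreground subjects with shadow suppression.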
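A per-pixel shadow test of this kind can be sketched as follows. Eq. (25) is not reproduced in this chapter, so the code follows the general form of the HSV shadow mask of Cucchiara et al. [3]: a shadow darkens V by a bounded factor while leaving saturation and hue nearly unchanged. All thresholds (`alpha`, `beta`, `tau_s`, `tau_h`), the 0..255 hue period, and the circular hue distance are assumptions made for illustration, not the values used in the thesis.

```python
def is_shadow(pixel_hsv, background_hsv,
              alpha=0.4, beta=0.9, tau_s=40, tau_h=40, hue_period=256):
    """Return True if a foreground candidate pixel looks like a cast shadow.

    Conditions (after Cucchiara et al. [3]; thresholds are hypothetical):
      alpha <= V / V_bg <= beta     the pixel is darker, but not too dark
      S - S_bg <= tau_s             saturation does not increase much
      hue distance <= tau_h         the chromaticity is roughly preserved
    """
    h, s, v = pixel_hsv
    h_bg, s_bg, v_bg = background_hsv
    if v_bg == 0:
        return False  # ratio undefined for a black background pixel
    ratio = v / v_bg
    dh = abs(h - h_bg)
    dh = min(dh, hue_period - dh)  # hue is circular
    return alpha <= ratio <= beta and (s - s_bg) <= tau_s and dh <= tau_h

background_pixel = (32, 70, 120)
print(is_shadow((30, 60, 70), background_pixel))    # darker, same color
print(is_shadow((30, 60, 118), background_pixel))   # almost unchanged brightness
print(is_shadow((150, 60, 70), background_pixel))   # darker, but different hue
```

Pixels that pass this test would be removed from the binary foreground image, leaving only pixels whose color, and not merely brightness, differs from the background.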

4.1.3 Foreground Extraction with Lighting Change Compensation

During foreground extraction, the background model must adapt to the lighting of the scene. Using the lighting change detection scheme of Eqs. (20), (21), (22), and (23), described in Sec. 3.1.3, we detect whether the current image exhibits a considerable lighting change. Fig. 4.7 shows the results under lighting changes. Figs. 4.7(a), 4.7(d), and 4.7(g) are three input images. Figs. 4.7(b), 4.7(e), and 4.7(h) show the foreground subjects without lighting change compensation. Figs. 4.7(c), 4.7(f), and 4.7(i) show the foreground subjects with lighting change compensation, a clear improvement over Figs. 4.7(b), 4.7(e), and 4.7(h).

Fig. 4.7 An example of foreground extraction under lighting change. (a) An image frame in which the subject is walking under “normal” lighting, (b) the foreground subject extracted from (a) without lighting change compensation, and (c) the foreground subject extracted from (a) with lighting change compensation. (d) An image frame in which the subject is walking under “dark” lighting, (e) the foreground subject extracted from (d) without lighting change compensation, and (f) the foreground subject extracted from (d) with lighting change compensation. (g) An image frame in which the subject is walking under “bright” lighting, (h) the foreground subject extracted from (g) without lighting change compensation, and (i) the foreground subject extracted from (g) with lighting change compensation.
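The idea of detecting a global lighting change and compensating for it can be illustrated with a deliberately simple sketch. Eqs. (20) to (23) are not reproduced in this chapter, so the code below is not the thesis scheme. It assumes a single global gain, estimated as the ratio of the mean V of the current frame to the mean V of the background, and rescales the background only when that gain departs from 1 by more than a hypothetical `change_threshold`.

```python
def compensate_lighting(frame_v, background_v, change_threshold=0.2):
    """Detect a global lighting change and rescale the background V channel.

    Returns (background, changed). The global-gain model and the threshold
    are illustrative assumptions, not the detection rule of Eqs. (20)-(23).
    """
    flat_f = [p for row in frame_v for p in row]
    flat_b = [p for row in background_v for p in row]
    mean_f = sum(flat_f) / len(flat_f)
    mean_b = max(sum(flat_b) / len(flat_b), 1e-6)  # guard against division by zero
    gain = mean_f / mean_b
    if abs(gain - 1.0) <= change_threshold:
        return background_v, False  # no considerable lighting change
    adjusted = [[min(255.0, p * gain) for p in row] for row in background_v]
    return adjusted, True

background = [[100.0, 100.0], [100.0, 100.0]]
dark_frame = [[50, 50], [50, 50]]      # scene suddenly half as bright
normal_frame = [[105, 105], [105, 105]]

print(compensate_lighting(dark_frame, background))
print(compensate_lighting(normal_frame, background))
```

After compensation, the frame ratio of a pixel that only became darker returns to about 1, so it is no longer misclassified as foreground. A practical detector would also need to exclude foreground pixels when estimating the gain.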

4.2 Object Extraction Using RGB Model

4.2.1 Background Model Construction Using RGB Model

We built the background model in the RGB color space. Each of the R, G, and B values lies between 0 and 255. Figs. 4.8(a), 4.8(b), and 4.8(c) show the background image in the R, G, and B components, respectively. These three figures show that the R, G, and B components are relatively stable. Fig. 4.9 shows the R, G, and B variations of two pixels at coordinates (x, y) = (10, 10) and (x, y) = (120, 160) during the first 300 frames of the background video. From Fig. 4.9, we can see that the R, G, and B components are stable in the background model. Hence, we use the R, G, and B components to build the background model using GMMs.


Fig. 4.8 Background images. (a) Background image in the R component, (b) Background image in the G component, and (c) Background image in the B component.


Fig. 4.9 R, G, and B variations versus frame index of background video image frame 1 to frame 300. (a) R at (10, 10), (b) R at (120, 160), (c) G at (10, 10), (d) G at (120, 160), (e) B at (10, 10), and (f) B at (120, 160).

4.2.2 Foreground Subjects Extraction Using RGB Model

The R, G, and B color components are stable and reliable, so we used them to describe each background pixel by a GMM instead of a single model.

In the first step, the 3D space is projected onto the 2D spaces of R-G, G-B, and B-R, and each pixel is classified using the standard deviations to obtain the binary image B(x, y) of Eq. (27), described in Sec. 3.2.3. The threshold k_G is chosen by experiments and varies with different trials. Hence, we ran a series of experiments to determine the optimal threshold k_G. Fig. 4.10 shows the binary image B(x, y) obtained with different values of k_G when the subject’s clothing color is different from the background. After the experiment, we set k_G = 1.5 in our system.

Fig. 4.10 An example of foreground extraction at different k_G thresholds. (a) An image frame with the subject’s clothing color different from the background, (b)−(f) foreground detected images: (b) k_G = 1.2, (c) k_G = 1.3, (d) k_G = 1.4, (e) k_G = 1.5, and (f) k_G = 1.6.
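The classification against a per-pixel mixture can be sketched as follows. Eq. (27) is not reproduced in this chapter, so this is an assumed form: each Gaussian component is described by a per-channel mean and standard deviation (a diagonal covariance), a pixel matches a component when it lies within k_G standard deviations in each of the R-G, G-B, and B-R planes, and a pixel that matches no component is foreground. The component values are hypothetical.

```python
# Index pairs of the three 2D projections: R-G, G-B, B-R.
PLANES = ((0, 1), (1, 2), (2, 0))

def matches_component(pixel, mean, std, k_g=1.5):
    """True if the pixel lies within k_g standard deviations in every 2D plane."""
    def inside(c):
        return abs(pixel[c] - mean[c]) <= k_g * std[c]
    return all(inside(a) and inside(b) for a, b in PLANES)

def is_foreground(pixel, components, k_g=1.5):
    """A pixel is foreground when it matches none of the background Gaussians."""
    return not any(matches_component(pixel, m, s, k_g) for m, s in components)

# Hypothetical two-component background model for one pixel: (mean RGB, std RGB).
components = [((100, 100, 100), (4, 4, 4)),
              ((200, 180, 160), (6, 6, 6))]

for p in [(103, 98, 101), (110, 100, 100), (205, 175, 165)]:
    print(p, "foreground" if is_foreground(p, components) else "background")
```

With a diagonal covariance the three plane tests reduce to a per-channel box test. They are written out separately here only to mirror the projection described in the text.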

4.2.3 Foreground Extraction with Lighting Change Compensation

A sufficiently large lighting change in the environment distorts the foreground subjects and degrades the recognition result. We use the lighting change detection and compensation algorithm of Eq. (22), described in Sec. 3.1.3, to determine whether the image frame pixels should be adjusted when there exists a large lighting change. Fig. 4.11 compares the results with and without lighting change compensation. Figs. 4.11(a), 4.11(d), and 4.11(g) are three input images. Figs. 4.11(b), 4.11(e), and 4.11(h) show the foreground subjects without lighting change compensation. Figs. 4.11(c), 4.11(f), and 4.11(i) show the foreground subjects with lighting change compensation. Foreground extraction with lighting change compensation segments the subject correctly despite the large lighting change, whereas without compensation the subject is not detected after a sudden decrease in light intensity, as in Fig. 4.11(e).

Fig. 4.11 An example of foreground extraction under lighting change. (a) An image frame in which the subject is walking under “normal” lighting, (b) the foreground subject extracted from (a) without lighting change compensation, and (c) the foreground subject extracted from (a) with lighting change compensation. (d) An image frame in which the subject is walking under “dark” lighting, (e) the foreground subject extracted from (d) without lighting change compensation, and (f) the foreground subject extracted from (d) with lighting change compensation.

4.2.4 Foreground Subjects Extraction with Moving History-Based Background Adaptation

When a moving subject stops for a while, it should be absorbed into the background. We use the history-based background adaptation of Eqs. (28) and (29), described in Sec. 3.2.3, to determine whether pixels belong to a moving subject that has stopped. Fig. 4.12 shows the results.

Fig. 4.12 contains twelve image frames in which one subject walks from the left to the right, and the other subject walks from the left to the center, stops for a while, and then walks from the center to the right. In Figs. 4.12(a), 4.12(b), and 4.12(c), both subjects are walking. In Figs. 4.12(d), 4.12(e), 4.12(f), 4.12(g), and 4.12(h), one of the subjects stops for a while and is shown in gray. In Figs. 4.12(i), 4.12(j), 4.12(k), and 4.12(l), that subject begins walking again.

Fig. 4.12 An example of moving history-based background adaptation.
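The stop detection can be illustrated with a per-pixel history counter. Eqs. (28) and (29) are not reproduced in this chapter, so the rule below is an assumption: a foreground pixel whose value stays nearly constant for `stop_frames` consecutive frames is labeled as stopped (the gray label mirrors the gray subject in Fig. 4.12), and the counter resets as soon as the pixel changes or leaves the foreground. Images are flattened to 1D lists for brevity, and all thresholds are hypothetical.

```python
MOVING, STOPPED, BACKGROUND = 255, 128, 0  # output labels; 128 is drawn as gray

def update_history(history, foreground, prev_frame, frame,
                   motion_eps=5, stop_frames=3):
    """Update per-pixel stationary counters in place and return pixel labels.

    history     list of ints, consecutive stationary-foreground frame counts
    foreground  list of 0/1 flags from the background subtraction step
    prev_frame, frame  lists of pixel intensities for consecutive frames
    """
    labels = []
    for i, (fg, prev, cur) in enumerate(zip(foreground, prev_frame, frame)):
        if not fg:
            history[i] = 0
            labels.append(BACKGROUND)
        elif abs(cur - prev) <= motion_eps:
            history[i] += 1  # foreground pixel that did not change
            labels.append(STOPPED if history[i] >= stop_frames else MOVING)
        else:
            history[i] = 0   # the subject moved again
            labels.append(MOVING)
    return labels

history = [0, 0]
still = [50, 10]
for _ in range(3):
    labels = update_history(history, [1, 0], still, still)
print(labels, history)
print(update_history(history, [1, 0], still, [90, 10]), history)
```

Pixels labeled as stopped are the candidates to be merged into the background model. When the subject starts walking again the counter resets, as in Figs. 4.12(i) to 4.12(l).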

4.3 The Extraction Rate of Object

We randomly selected 30 frames from the video sequence of the model under the three lighting cases, each frame containing a subject whose clothing color differs from the background. The “foreground subject ground truths” of these 30 frames were generated manually. Let A be a detected foreground subject region and B be the corresponding “ground truth.” We then measure the pixel accuracy rate [18] by

$$\text{Accuracy rate} = \frac{A \cap B}{A \cup B} \times 100\%, \tag{30}$$

which gives the percentage of mutually positive pixels (the intersection) relative to the expanded positive pixels (the union). Table I lists the accuracy rates in the HSV color space over the 30 frames and shows the improvement obtained with lighting change compensation over the results without it. Consistent with Fig. 4.7, the segmentation accuracy improves most in the “dark” case, from 28.95% to 81.09%. Table II lists the accuracy rates in the RGB color space over the 30 frames and shows the same kind of improvement. Consistent with Fig. 4.11, the segmentation accuracy again improves most in the “dark” case, from 24.87% to 70.63%.
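The accuracy rate of Eq. (30) can be computed directly from two binary masks by counting pixels in the intersection and the union. The sketch below is a minimal illustration on toy masks. The function name and the convention of returning 100% when both masks are empty are assumptions.

```python
def accuracy_rate(detected, truth):
    """Pixel accuracy rate of Eq. (30): |A ∩ B| / |A ∪ B| × 100%.

    detected, truth: 2D lists of 0/1 values (A and B in the text).
    """
    intersection = 0
    union = 0
    for row_d, row_t in zip(detected, truth):
        for d, t in zip(row_d, row_t):
            if d and t:
                intersection += 1
            if d or t:
                union += 1
    if union == 0:
        return 100.0  # both masks empty: treated as perfect agreement (assumption)
    return 100.0 * intersection / union

A = [[1, 1, 0],
     [0, 1, 0]]   # detected foreground
B = [[1, 0, 0],
     [0, 1, 1]]   # manually labeled ground truth

print(accuracy_rate(A, B))  # 2 common pixels out of 4 in the union -> 50.0
```

Because the denominator is the union, both missed foreground pixels and false detections lower the rate.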

TABLE I

COMPARISON OF THE PIXEL ACCURACY RATES OVER 30 IMAGES IN THE HSV COLOR SPACE

| Pixel accuracy rates (HSV color space) | Without lighting change compensation | With lighting change compensation |
| --- | --- | --- |
| “Normal” lighting case | 87.67% | 87.86% |
| “Dark” lighting case | 28.95% | 81.09% |
| “Bright” lighting case | 87.99% | 89.28% |
| Average | 68.20% | 86.08% |

TABLE II

COMPARISON OF THE PIXEL ACCURACY RATES OVER 30 IMAGES IN THE RGB COLOR SPACE

| Pixel accuracy rates (RGB color space) | Without lighting change compensation | With lighting change compensation |
| --- | --- | --- |
| “Normal” lighting case | 76.53% | 77.43% |
| “Dark” lighting case | 24.87% | 70.63% |
| “Bright” lighting case | 84.48% | 84.99% |
| Average | 61.96% | 77.68% |
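The “Average” rows of Tables I and II can be checked as the unweighted mean of the three lighting cases. The short sketch below recomputes them from the tabulated values. Reading the first numeric column as “without” and the second as “with” lighting change compensation is an interpretation based on the accompanying text.

```python
# Accuracy rates (%) from Tables I and II: (without, with) compensation.
tables = {
    "HSV": {"normal": (87.67, 87.86), "dark": (28.95, 81.09), "bright": (87.99, 89.28)},
    "RGB": {"normal": (76.53, 77.43), "dark": (24.87, 70.63), "bright": (84.48, 84.99)},
}

def column_averages(rows):
    """Unweighted mean of each column over the lighting cases."""
    n = len(rows)
    without = sum(r[0] for r in rows.values()) / n
    with_comp = sum(r[1] for r in rows.values()) / n
    return without, with_comp

for space, rows in tables.items():
    without, with_comp = column_averages(rows)
    print(f"{space}: without {without:.2f}%, with {with_comp:.2f}%, "
          f"gain {with_comp - without:.2f} points")
```

Most of the average gain comes from the “dark” case. The “normal” and “bright” cases change by about one percentage point or less.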

Chapter 5 Conclusion

In this thesis, we have proposed an adaptive background modeling method, in either the HSV or the RGB color space, to improve subject extraction. In these two color spaces, we can use not only the luminance component but also the chromatic components of the background image. A detected luminance change triggers the adaptation of our model to that change, so that the foreground subject can be extracted reliably. In addition, the statistics on the subject’s moving/stopping status trigger the model’s adaptation to changes in motion. With lighting change compensation, the experiments gave average pixel accuracy rates of 86.08% in the HSV color space and 77.68% in the RGB color space over the three lighting cases.

Some subjects wearing light-colored clothing, e.g., pink, still cannot be extracted well, which deserves further investigation. In addition, recognition from different viewing directions, a wider range of test environments, more complicated surroundings, and more complicated activities are left for future work.

References

[1] I. Haritaoglu, D. Harwood, and L. S. Davis, “W⁴: real-time surveillance of people and their activities,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 809−830, 2000.

[2] C. Stauffer and W. E. L. Grimson, “Adaptive background mixture models for real-time tracking,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, June 1999.

[3] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, “Improving shadow suppression in moving object detection with HSV color information,” in Proc. IEEE Intelligent Transportation Systems Conference, pp. 334−339, 2001.

[4] N. Friedman and S. Russell, “Image segmentation in video sequences: a probabilistic approach,” in Proc. Thirteenth Conf. Uncertainty in Artificial Intelligence, pp. 175−181, Aug. 1997.

[5] S. Park and J. K. Aggarwal, “Segmentation and tracking of interacting human body parts under occlusion and shadowing,” in Proc. of the Workshop on Motion and Video Computing, pp. 105−111, 2002.

[6] M. K. Leung and Y. H. Yang, “First sight: a human-body outline labeling system,” IEEE Trans. Pattern Anal. Machine Intell., vol. 17, no. 4, pp. 359−377, 1995.

[7] S. Jabri, Z. Duric, H. Wechsler, and A. Rosenfeld, “Detection and location of people in video images using adaptive fusion of color and edge information,” in Proc. Int. Conf. Pattern Recognition, pp. 627−630, 2000.

[8] T. Horprasert, D. Harwood, and L. S. Davis, “A statistical approach for real-time robust background subtraction and shadow detection,” in Proc. IEEE ICCV’99, 1999.

[9] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, “Improving shadow suppression in moving object detection with HSV color information,” in Proc. IEEE Intelligent Transportation Systems Conference, pp. 334−339, 2001.

[10] A. Prati, I. Mikic, M. Trivedi, and R. Cucchiara, “Detecting moving shadows: algorithms and evaluation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 7, pp. 918−923, 2003.

[11] B. Chen and Y. Lei, “Indoor and outdoor people detection and shadow suppression by exploiting HSV color information,” in Proc. Fourth International Conference on Computer and Information Technology, pp. 137−142, 2004.

[12] S. Vitabile, G. Pilato, G. Pollaccia, and F. Sorbello, “Road signs recognition using a dynamic pixel aggregation technique in the HSV color space,” in Proc. 11th International Conference on Image Analysis and Processing, pp. 572−577, 2002.

[13] R. Cucchiara, M. Piccardi, and A. Prati, “Detecting moving objects, ghosts, and shadows in video streams,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1337−1342, 2003.

[14] K. Ohba, Y. Sato, and K. Ikeuchi, “Appearance-based visual learning and object recognition with illumination invariance,” Machine Vision and Applications, vol. 12, no. 4, pp. 189−196, 2000.

[15] S. J. McKenna, Y. Raja, and S. Gong, “Tracking color objects using adaptive mixture models,” Image and Vision Computing, vol. 17, pp. 225−231, 1999.

[16] P. KaewTraKulPong and R. Bowden, “An improved adaptive background mixture model for real-time tracking with shadow detection,” in Proc. 2nd European Workshop on Advanced Video Based Surveillance Systems, AVBS01, Sept. 2001.

[17] D. S. Lee, “Effective Gaussian mixture learning for video background subtraction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 827−832, 2005.

[18] L. Li, W. Huang, I. Y. H. Gu, and Q. Tian, “Statistical modeling of complex backgrounds for foreground object detection,” IEEE Transactions on Image Processing, vol. 13, no. 11, pp. 1459−1472, 2004.
