Chapter 3 Multi-objects Tracking System with Adaptive Background Reconstruction
3.2 Foreground Segmentation
3.5.2 Accident Prediction
We also present a prediction module for traffic accidents. We analyze the trajectories of any two objects and classify their relation into four types: both objects stopped, only one object moving, same moving direction, and different moving directions. The classification equations are shown in Eq. (23). The prediction module processes only the latter three types.
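The four-way classification can be illustrated with a short sketch. Since Eq. (23) is not reproduced here, the rule below is only an assumption: an object counts as stopped when its speed is below a hypothetical `speed_eps`, and two moving objects share a direction when the angle between their velocity vectors is below a hypothetical `angle_eps_deg`. The names `Relation` and `classify_relation` are ours, for illustration only.

```python
import math
from enum import Enum


class Relation(Enum):
    BOTH_STOPPED = 0          # not processed by the prediction module
    ONE_MOVING = 1            # type I
    SAME_DIRECTION = 2        # type II
    DIFFERENT_DIRECTIONS = 3  # type III


def classify_relation(v1, v2, speed_eps=0.5, angle_eps_deg=15.0):
    """Classify two objects from their velocity vectors (vx, vy).

    speed_eps and angle_eps_deg are hypothetical thresholds, not values
    taken from the thesis.
    """
    s1 = math.hypot(v1[0], v1[1])
    s2 = math.hypot(v2[0], v2[1])
    moving1 = s1 > speed_eps
    moving2 = s2 > speed_eps
    if not moving1 and not moving2:
        return Relation.BOTH_STOPPED
    if moving1 != moving2:
        return Relation.ONE_MOVING
    # Both objects move: compare directions through the angle between them.
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (s1 * s2)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    angle = math.degrees(math.acos(cos_angle))
    if angle <= angle_eps_deg:
        return Relation.SAME_DIRECTION
    return Relation.DIFFERENT_DIRECTIONS
```

For example, `classify_relation((3, 0), (0, 3))` yields `Relation.DIFFERENT_DIRECTIONS`, because the two velocity vectors are perpendicular.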
Firstly, for type I (only one moving object), we calculate the position at which the two objects are closest and compare the distance between them at that position with a preset threshold to predict the occurrence of an accident. The diagram of this type is shown in Fig. 18.
Fig. 18 Diagram of type I of the relation of two objects
The equations of the closest position and the distance are shown in Eq. (24) and Eq. (25).
Then we use Eq. (26) to predict the occurrence of an accident and its position.
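A minimal sketch of the type I check follows, assuming the moving object keeps a constant velocity and the stopped object is treated as a point. It is one plausible reading of Eq. (24) to Eq. (26), not a reproduction of them; `predict_type1` and its parameters are hypothetical names.

```python
import math


def predict_type1(p_move, v_move, p_stop, dist_threshold):
    """Return (accident_predicted, closest_position, distance).

    p_move, p_stop: (x, y) positions; v_move: (vx, vy) per frame.
    Assumption: the moving object follows a straight line at constant speed.
    """
    vx, vy = v_move
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        raise ValueError("the moving object must have a non-zero velocity")
    dx = p_stop[0] - p_move[0]
    dy = p_stop[1] - p_move[1]
    # Time at which the moving object is nearest to the stopped object;
    # clamped to 0 so that positions already passed are ignored.
    t = max(0.0, (dx * vx + dy * vy) / speed_sq)
    closest = (p_move[0] + vx * t, p_move[1] + vy * t)
    distance = math.hypot(p_stop[0] - closest[0], p_stop[1] - closest[1])
    return distance < dist_threshold, closest, distance
```

With a moving object at (0, 0) travelling at (2, 0) and a stopped object at (10, 1), the closest position is (10, 0) and the distance is 1, so a threshold of 3 raises a prediction.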
Secondly, we analyze the same-moving-direction type further and focus only on the situation in which the trajectories of the two objects are almost on the same line. Eq. (27) is used, together with a threshold, to check whether the trajectories are almost on the same line.
Then the occurrence of an accident and its position are predicted by Eq. (28). The diagram of this type is shown in Fig. 19.
Fig. 19 Diagram of type II of the relation of two objects
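The type II check can be sketched as below. This is an assumption-laden illustration rather than Eq. (27) and Eq. (28) themselves: we take "almost on the same line" to mean that the perpendicular offset of the second object from the first object's line of motion is below a hypothetical `line_threshold`, and we predict a collision when the gap along that line closes within a hypothetical time `horizon`.

```python
import math


def predict_type2(p1, v1, p2, v2, line_threshold, horizon):
    """Return (time, position) of a predicted collision, or None.

    Assumptions: constant velocities, point objects, and object 1 moving.
    """
    s1 = math.hypot(v1[0], v1[1])
    if s1 == 0.0:
        raise ValueError("object 1 must be moving")
    ux, uy = v1[0] / s1, v1[1] / s1          # unit direction of object 1
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    lateral = abs(dx * uy - dy * ux)         # offset from object 1's line
    if lateral > line_threshold:
        return None                          # not on (almost) the same line
    along = dx * ux + dy * uy                # signed gap along the line
    closing = s1 - (v2[0] * ux + v2[1] * uy) # rate at which the gap shrinks
    if closing == 0.0:
        return None                          # same speed: gap never closes
    t = along / closing
    if t < 0.0 or t > horizon:
        return None
    return t, (p1[0] + v1[0] * t, p1[1] + v1[1] * t)
```

For instance, an object at (0, 0) moving at (4, 0) behind an object at (12, 0.5) moving at (1, 0) closes the 12-unit gap at 3 units per frame, so the sketch predicts a collision after 4 frames at (16, 0).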
Thirdly, if the moving directions of the two objects are different, we use Eq. (29) to obtain the time each object needs to reach the crossing point of their trajectories.
Then we predict the occurrence of an accident with Eq. (30) and obtain its position. The diagram of this type is shown in Fig. 20.
Fig. 20 Diagram of type III of the relation of two objects
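The type III computation can be sketched as a line intersection followed by a comparison of arrival times. The decision rule here is our assumption, not Eq. (30): an accident is predicted when both objects reach the crossing point in the future and their arrival times differ by less than a hypothetical `time_threshold`.

```python
def predict_type3(p1, v1, p2, v2, time_threshold):
    """Return (accident_predicted, crossing_point, t1, t2), or None.

    Solves p1 + t1 * v1 == p2 + t2 * v2 for the arrival times t1 and t2.
    None means the trajectories are parallel or the crossing lies behind
    at least one of the objects.
    """
    denom = v1[0] * v2[1] - v1[1] * v2[0]    # cross(v1, v2)
    if denom == 0:
        return None                          # parallel trajectories
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * v2[1] - dy * v2[0]) / denom   # cross(d, v2) / cross(v1, v2)
    t2 = (dx * v1[1] - dy * v1[0]) / denom   # cross(d, v1) / cross(v1, v2)
    if t1 < 0 or t2 < 0:
        return None
    crossing = (p1[0] + v1[0] * t1, p1[1] + v1[1] * t1)
    return abs(t1 - t2) < time_threshold, crossing, t1, t2
```

As a usage example, an object at (0, 0) moving at (1, 0) and an object at (5, -5) moving at (0, 1) both reach (5, 0) after 5 frames, so a prediction is raised for any positive threshold.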
Chapter 4
Experimental Results
We implemented our tracking system on a PC equipped with an Intel P4 2.4 GHz CPU and 512 MB of RAM. The software was developed with Borland C++ Builder 6.0 on Windows 2000. The inputs are video files (uncompressed AVI format) or image sequences (BMP format), all at a resolution of 320 by 240. These inputs were either captured by ourselves with a DV camera at a traffic intersection or taken from test samples used by other research.
In Section 4.1, we show the experimental results of background reconstruction and foreground segmentation. In Section 4.2, the results of object tracking are presented. In Section 4.3, the implemented system is presented and the results of the extraction of traffic parameters are demonstrated.
4.1 Results of Background Reconstruction and Foreground Segmentation
4.1.1 Background Initialization
We use a video file captured at a traffic intersection to demonstrate our proposed algorithm for background initialization. The threshold of appearing is preset to 15 and the duration of initialization is 200 frames. As a result, 76416 pixels were fully updated, which gives an update ratio of 99.5%. The partially updated pixels account for only 0.5% and will be updated soon by the background reconstruction module.
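The reported ratio can be checked against the 320 by 240 input size stated at the beginning of this chapter. The short calculation below is only an arithmetic check; the variable names are ours.

```python
FRAME_WIDTH, FRAME_HEIGHT = 320, 240   # input resolution stated in Chapter 4
fully_updated = 76416                  # pixels fully updated after 200 frames

total_pixels = FRAME_WIDTH * FRAME_HEIGHT
partially_updated = total_pixels - fully_updated
updated_ratio = fully_updated / total_pixels

print(total_pixels, partially_updated, f"{updated_ratio:.1%}")
# prints: 76800 384 99.5%
```

The remaining 384 pixels are the 0.5% left to the background reconstruction module.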
(a): Frame #1 (b): Frame #6
(c): Frame #16 (d): Frame #26
(e): Frame #36 (f): Frame #46
(g): Frame #66 (h): Frame #86
(i): Frame #86 (j): Frame #166
Fig. 21 Temporary background images during background initialization
Some temporary background images created during background initialization are shown in Fig. 21. In images (a) ~ (h), the moving objects are removed automatically by our initialization algorithm. Images (i) and (j) are locally enlarged images: some small regions had not yet been updated in image (i), and those regions were updated well in image (j). This is why we preset the duration of initialization to 200 frames, since we intend to construct a robust initial background image.
4.1.2 Adaptive Background Updating
The main purpose of the background updating module is to reconstruct a robust background image, but some situations in outdoor environments easily result in wrong background images. We use a PETS (Performance Evaluation of Tracking and Surveillance) dataset to evaluate the adaptive background updating algorithm. The video is the PETS2001 [43] dataset2 training video, which was chosen for its significant lighting variation.
We also performed tracking on this video using a fixed background updating method with two different updating frequencies, in order to compare them with our proposed algorithm.
The experimental results are shown in Fig. 22. The leading number of each label is the frame index in the image sequence. The -a images show the results of our adaptive updating algorithm, the -b images show the results of a fixed updating frequency of once per 30 frames, and the -c images show the results of a fixed updating frequency of once per 10 frames. The upper images are the results of background subtraction and the lower images are the current background images. We focus on the bottom-right corner of each image. Because the car entered the camera’s FoV slowly, some pixels of its region were regarded as background. This symptom becomes worse as the updating frequency increases. In frames #2212 and #2234, we can find the ghost region in the background image. The #2234R images are the tracking results of frame #2234, and wrong segmentation occurred in the #2234-c image. This shows that, under stable environmental conditions, a fast updating frequency results in wrong segmentation more easily than a slow one.
(#2212-a) (#2212-b) (#2212-c)
(#2234-a) (#2234-b) (#2234-c)
(#2234R-a) (#2234R-b) (#2234R-c)
Fig. 22 Experimental results of background updating (#2212~#2234)
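To make the effect of the updating frequency concrete, the toy sketch below follows a single background pixel that stays covered by a slowly moving object. The blending rule, the weight `alpha`, and all pixel values are assumptions chosen for illustration; they are not the fixed updating method used in the experiment.

```python
def contamination(interval, n_frames=60, alpha=0.25,
                  true_bg=50.0, object_value=200.0):
    """Return how far the stored background drifts toward the object.

    The pixel is covered by the object for n_frames frames, and the stored
    background is blended with the current frame once every `interval`
    frames (hypothetical running-average rule).
    """
    bg = true_bg
    for frame_index in range(1, n_frames + 1):
        if frame_index % interval == 0:
            bg = (1.0 - alpha) * bg + alpha * object_value
    return bg - true_bg


fast = contamination(interval=10)   # 6 updates while the pixel is covered
slow = contamination(interval=30)   # 2 updates while the pixel is covered
print(fast > slow)
```

Under these assumptions the pixel updated every 10 frames absorbs more of the object than the pixel updated every 30 frames, which matches the ghost region observed in the -c images.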
Next, we demonstrate the situation of significant lighting variation, which occurs in the later frames of the same video.
#3300-a #3300-b #3300-c
#3392-a #3392-b #3392-c
#3472-a #3472-b #3472-c
#3580-a #3580-b #3580-c
Fig. 23 Experimental results of background updating (#3300~#3580)
In Fig. 23, the upper images are the tracking results and the lower images are the results of background subtraction. In frame #3300, the slow fixed updating method (-b images) produced wrong segmentation, but the other two methods updated the background fast enough to cope with the lighting changes. In frames #3392 and #3472, the fast fixed updating method (-c images) also produced some wrong segmentation, and the slow one still tracked the wrong objects. In frame #3580, the background images of our proposed method and of the fast fixed updating method had converged to stable images, but there was still a considerable difference between the correct background and the background image of the slow fixed updating method. The results show that our proposed updating algorithm achieves a good balance and better performance across normal environmental conditions, gradual changes and rapid changes.
4.2 Results of Objects Tracking
4.2.1 Objects Tracking
In this section, we demonstrate the results of the tracking algorithm with several image sequences.
Frame #500 Frame #540
Frame #580 Frame #620
Frame #660 Frame #700
At first, we show the results of object tracking with a real-life traffic video of an intersection scene. In Fig. 24, the moving objects are detected and tracked robustly, and the correct labels are assigned to the same objects continuously during the tracking. The next video is the PETS2001 dataset1 training video. With this video, we also achieve good segmentation and robust tracking of moving objects; the results are shown in Fig. 25.
Frame #2760 Frame #2800
Frame #2840 Frame #2880
Frame #2920 Frame #2960
Fig. 25 Tracking results of PETS2001 dataset1 (#2760 ~ #2960)
4.2.2 Occlusion and Splitting of Multiple Objects
Our proposed tracking algorithm can handle occlusion and splitting events robustly, even though there are various kinds of objects or more than two objects in the occlusion group. In Fig. 26 and Fig. 27, we show the tracking results of objects with occlusion and splitting events.
Frame #840 Frame #850
Frame #860 Frame #880
Frame #900 Frame #910
Fig. 26 Experimental results of objects with occlusion and split events
Occlusion and splitting events occurred with the two vehicles (No. 44 and No. 48) in Fig. 26. Although the color and shape of the two vehicles are similar, our algorithm can reason about the splitting according to other features of the objects and assign the correct labels to them. We also show more complicated occlusion events in Fig. 27, where an occlusion of four vehicles occurred during frames #1200 to #1350. No. 65 is an object with an implicit occlusion, so our system treats it as a single object. During frames #1250 to #1270, those vehicles merged together. Then vehicle No. 68 split from the occlusion group and was tracked with the correct label in frame #1272. No. 73 was created as a new object because it split from an object with implicit occlusion. No. 61 also split from the group correctly in frame #1294. No. 62 did not split from the occlusion group because it was still merged with another object when it left the camera’s FoV.
Frame #1200 Frame #1240
Frame #1254 Frame #1258
Frame #1260 Frame #1270
Frame #1272 Frame #1294
Frame #1320 Frame #1340
Fig. 27 Experimental results of occlusion and split of multiple objects
4.2.3 Stopped Objects
Stopped objects or a moving background often occur during tracking or monitoring. Stopped objects usually result from vehicles or motorcycles that are waiting to turn or are parking. If stopped vehicles start to move, they result in a moving background or a ghost foreground. We propose a sleep_life feature to cope with these special situations. According to the characteristics of the surveillance environment, we can preset the threshold of the sleep_life feature so that stopped objects or ghost foreground are updated into the background image after a specific duration. We demonstrate tracking results of stopped objects in Fig. 28.
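The sleep_life mechanism can be sketched as a per-object counter. Everything except the name `sleep_life` and the "larger than the threshold" rule is an assumption made for illustration: the `TrackedObject` class, the rectangular `box`, the `moved` flag supplied by the tracker, and images stored as lists of rows.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    box: tuple           # (x0, y0, x1, y1), x1 and y1 exclusive (assumed layout)
    sleep_life: int = 0  # number of consecutive frames without motion


def update_sleep_life(obj, moved, frame, background, threshold):
    """Advance the counter; copy the object's region into the background
    once sleep_life is larger than the threshold. Returns True when the
    background was updated."""
    if moved:
        obj.sleep_life = 0
        return False
    obj.sleep_life += 1
    if obj.sleep_life <= threshold:
        return False
    x0, y0, x1, y1 = obj.box
    for y in range(y0, y1):
        background[y][x0:x1] = frame[y][x0:x1]
    return True


# Usage: a stopped object (or ghost) is absorbed after `threshold` idle frames.
frame = [[9] * 4 for _ in range(4)]
background = [[0] * 4 for _ in range(4)]
ghost = TrackedObject(box=(1, 1, 3, 3))
results = [update_sleep_life(ghost, False, frame, background, threshold=2)
           for _ in range(3)]
```

In this usage example the first two calls leave the background untouched and the third call copies the 2 by 2 region into it, so the same rule removes both a parked vehicle and a ghost left behind by a departing one.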
Frame #900 Frame #960
Frame #1100 Frame #1140
Frame #1180 Frame #1300
Frame #1380 Frame #1420
Frame #1540 Frame #1560
Frame #1580 Frame #1580 (Background)
Frame #1620 Frame #1620 (Background)
Frame #1640 Frame #1640 (Background)
Frame #1750 Frame #1750 (Background)
Fig. 28 Tracking results of stopped objects
In Fig. 28, a car (No. 2) was originally parked on campus and intended to leave, while another car (No. 3) intended to park on this campus. A ghost object (No. 8) occurred in frame #1420 because the car (No. 2) had previously been regarded as a region of the background. In frame #1620, the system copied the region of the ghost object into the background image when its sleep_life feature became larger than the threshold. In frame #1750, the car (No. 3), which had parked on campus, was also regarded as background.
4.3 Behavior Analysis
Fig. 29 The interface of the implemented program
We implemented a simple program to present the algorithms and the system we proposed. We also use this program to display some traffic parameters which we extracted from the tracking system or the behavior analysis module. The interface of the implemented program is shown in Fig. 29.
We implement a simple but effective method to classify moving objects into four categories: people, motorcycles, cars and large cars. This method requires correct camera calibration information. In Fig. 30, we show experimental results of object classification. We designed four counters that show the accumulated number of moving objects in each category, and the bounding boxes of objects are drawn in category-specific colors: red for people, yellow for motorcycles, blue for cars and green for large cars.
(a)
(b)
Fig. 30 Experimental results of objects classification
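The counters and box colors described above can be sketched as follows. The color mapping is taken from the text; the size rule is only an assumption, since the classification criterion is not detailed in this section. We suppose that each object has an estimated ground-plane area in square metres (which is only available with camera calibration) and we use hypothetical area limits.

```python
from collections import Counter

# Bounding-box colors as described in the text.
BOX_COLORS = {
    "people": "red",
    "motorcycles": "yellow",
    "cars": "blue",
    "large cars": "green",
}


def classify_by_area(area_m2, limits=(1.0, 3.0, 12.0)):
    """Map a calibrated object area to a category.

    The three limits (in square metres) are hypothetical values, not
    parameters taken from the thesis.
    """
    people_max, motorcycle_max, car_max = limits
    if area_m2 < people_max:
        return "people"
    if area_m2 < motorcycle_max:
        return "motorcycles"
    if area_m2 < car_max:
        return "cars"
    return "large cars"


def count_categories(areas_m2):
    """Accumulate one counter per category, as in the four on-screen counters."""
    return Counter(classify_by_area(area) for area in areas_m2)


counters = count_categories([0.5, 2.0, 8.0, 9.5, 20.0])
```

Here `counters["cars"]` is 2 and every other category is counted once; the color of each box is then looked up with `BOX_COLORS[category]`.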
Next, we use the trajectory estimation of the objects to implement the prediction of traffic accidents. The results are shown in Fig. 31. If our system predicts the occurrence of an accident, we draw an accident box and the two predicted trajectories in the tracking result images. Because no real accident happened in frame #128, the accident alarm disappears in the later frames.
(a) Frame #126
(b) Frame #128
Fig. 31 Experimental results of accident prediction
Chapter 5 Conclusions
In this thesis, we present a real-time multi-objects tracking system. First, we propose a segmentation algorithm with an adaptive background reconstruction technique. This technique helps our system cope with gradual and sudden changes of the outdoor environment, especially those resulting from lighting variation. Next, we introduce a simple but effective and generalized tracking algorithm which combines the region-based method and the feature-based method. The algorithm uses the overlap area as the first matching criterion and simplifies the matching relations into several categories. We also design a reasoning framework that can analyze and deal robustly with occlusion and splitting events of multiple and various objects.
Following the structure of the proposed algorithms, we implemented a tracking system that includes object classification and accident prediction functions. Experiments were conducted on real-life traffic video of an intersection and on datasets from other surveillance research. We demonstrated robust object segmentation and successful tracking of various objects with occlusion or splitting events. The system also extracted useful traffic parameters. Besides, the properties of moving objects and the results of behavior analysis are valuable to monitoring applications and other surveillance systems.
In the future, some improvements can still be made to the performance and the robustness of the tracking system. Firstly, we can improve the segmentation algorithm so that it is more robust to complex scenes. Secondly, it is also important to refine the object features according to the characteristics of various objects, in order to handle more complicated occlusion and splitting events robustly. Thirdly, using the scene information or the camera model effectively will provide more accurate surveillance information and traffic parameters. Besides, foreground segmentation can help reduce the data flow of surveillance video transmission if we deliver only foreground information with an advanced compression method. The object properties are also suitable material for high-level behavior recognition of moving objects, and this work will further assist surveillance systems in constructing content retrieval and management structures.
References
[1] M. Brand, “Morphable 3D models from video,” in Proc. of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2001, pp. II-456 – II-463.
[2] D. Meyer, J. Denzler, and H. Niemann, “Model based extraction of articulated objects in image sequences for gait analysis,” in Proc. of the IEEE International Conference on Image Processing, vol. 3, Oct. 1997, pp. 78 – 81.
[3] J. L. Barron, D. J. Fleet, S. S. Beauchemin, and T. A. Burkitt, “Performance of optical flow techniques,” in Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 1994, pp. 236 - 242.
[4] A. J. Lipton, H. Fujiyoshi, and R. S. Patil, “Moving target classification and tracking from real-time video,” in Proc. of the IEEE Workshop on Applications of Computer Vision, Oct. 1998, pp. 8 – 14.
[5] I. Haritaoglu, D. Harwood, and L. S. Davis, “W4: real-time surveillance of people and their activities,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 809 – 830, Aug. 2000.
[6] A. Elgammal, R. Duraiswami, D. Harwood, and L. S. Davis, “Nonparametric kernel density estimation for visual surveillance,” in Proc. of the IEEE, vol. 90, no. 7, July 2002, pp. 1151 – 1163.
[7] S. Kamijo and M. Sakauchi, “Segmentation of vehicles and pedestrians in traffic scene by spatio-temporal Markov random field model,” in Proc. of the IEEE International Conference on Acoustics, Speech, & Signal Processing, vol. 3, Apr. 2003, pp. III - 361-4.
[8] J. Kato, T. Watanabe, S. Joga, Y. Liu, and H. Hase, “An HMM/MRF-based stochastic framework for robust vehicle tracking,” IEEE Transactions on Intelligent Transportation
[9] P. L. Rosin and T. J. Ellis, “Detecting and classifying intruders in image sequences,” in Proc. 2nd British Machine Vision Conference, 1991, pp. 293–300.
[10] N. Friedman and S. Russell, “Image segmentation in video sequences: a probabilistic approach,” in Proc. 13th Conference Uncertainty in Artificial Intelligence, 1997, pp. 175–181.
[11] C. Ridder, O. Munkelt, and H. Kirchner, “Adaptive background estimation and foreground detection using Kalman filtering,” in Proc. Int. Conference Recent Advances Mechatronics, 1995, pp. 193–199.
[12] C. Stauffer and W. E. L. Grimson, “Adaptive background mixture models for real-time tracking,” in Proc. of Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, June 1999.
[13] S. McKenna, S. Jabri, Z. Duric, A. Rosenfeld, and H. Wechsler, “Tracking groups of people,” Computer Vision and Image Understanding, vol. 80, no. 1, pp. 42–56, 2000.
[14] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, “Detecting moving objects, ghosts, and shadows in video streams,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1337 – 1342, Oct. 2003.
[15] L. Li, W. Huang, I. Y. H. Gu, and Q. Tian, “Statistical modeling of complex backgrounds for foreground object detection,” IEEE Transactions on Image Processing, vol. 13, no. 11, pp. 1459 – 1472, Nov. 2004.
[16] W. Hu, T. Tan, L. Wang, and S. Maybank, “A survey on visual surveillance of object motion and behaviors,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 34, no. 3, pp. 334 – 352, Aug. 2004.
[17] S. Gupte, O. Masoud, R. F. K. Martin, and N. P. Papanikolopoulos, “Detection and classification of vehicles,” IEEE Transactions on Intelligent Transportation Systems, vol. 3, no. 1, pp. 37–47, Mar. 2002.
[18] D. Koller, J. Weber, T. Huang, J. Malik, G. Ogasawara, B. Rao, and S. Russell, “Towards robust automatic traffic scene analysis in real-time,” in Proc. of the 12th IAPR International Conference on Pattern Recognition, vol. 1, 1994, pp. 126 – 131.
[19] A. Chachich, A. Pau, A. Barber, K. Kennedy, E. Olejniczak, J. Hackney, Q. Sun, and E. Mireles, “Traffic sensor using a color vision method,” in Proc. of SPIE: Transportation Sensors and Controls: Collision Avoidance, Traffic Management, and ITS, vol. 2902, pp. 156–165, 1996.
[20] T. N. Tan, G. D. Sullivan, and K. D. Baker, “Model-based localization and recognition of road vehicles,” International Journal of Computer Vision, vol. 27, no. 1, pp. 5–25, 1998.
[21] R. Cucchiara, P. Mello, and M. Piccardi, “Image analysis and rule-based reasoning for a traffic monitoring system,” IEEE Transactions on Intelligent Transportation Systems, vol. 1, no. 2, pp. 119–130, June 2000.
[22] A. Elgammal, R. Duraiswami, and L. S. Davis, “Efficient kernel density estimation using the fast gauss transform with application to color modeling and tracking,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 11, pp. 1499 – 1504, Nov. 2003.
[23] H. Veeraraghavan, O. Masoud, and N. P. Papanikolopoulos, “Computer vision algorithms for intersection monitoring,” IEEE Transactions on Intelligent Transportation Systems, vol. 4, no. 2, pp. 78 - 89, June 2003.
[24] S. C. Chen, M. L. Shyu, S. Peeta, and C. Zhang, “Learning-based spatio-temporal vehicle tracking and indexing for transportation multimedia database systems,” IEEE Transactions on Intelligent Transportation Systems, vol. 4, no. 3, pp. 154 – 167, Sep. 2003.
[25] P. Kumar, S. Ranganath, K. Sengupta, and W. Huang, “Co-operative multi-target tracking and classification,” in Proc. of European Conference on Computer Vision, May 2004, pp. 376–389.
[26] S. K. Zhou, R. Chellappa, and B. Moghaddam, “Visual tracking and recognition using appearance-adaptive models in particle filters,” IEEE Transactions on Image Processing, vol. 13, no. 11, pp. 1491 – 1506, Nov. 2004.
[27] H. T. Nguyen and A. W. M. Smeulders, “Fast occluded object tracking by a robust appearance filter,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 8, pp. 1099 – 1104, Aug. 2004.
[28] J. Kang, I. Cohen, and G. Medioni, “Continuous tracking within and across camera streams,” in Proc. of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, June 2003, pp. I-267 - I-272.
[29] M. Fathy and M. Y. Siyal, “A window-based image processing technique for quantitative and qualitative analysis of road traffic parameters,” IEEE Transactions on Vehicular Technology, vol. 47, no. 4, pp. 1342 – 1349, Nov. 1998.
[30] Y. K. Jung and Y. S. Ho, “Traffic parameter extraction using video-based vehicle tracking,” in Proc. of 1999 IEEE/IEEJ/JSAI International Conference on Intelligent Transportation Systems, 1999, pp. 764 – 769.
[31] P. Kumar, S. Ranganath, W. Huang, and K. Sengupta, “Framework for real-time behavior interpretation from traffic video,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 43 – 53, Mar. 2005.
[32] M. Haag and H. H. Nagel, “Incremental recognition of traffic situations from video image sequences,” Image and Vision Computing, vol. 18, no. 2, pp. 137-153, 2000.
[33] P. Remagnino, T. Tan, and K. Baker, “Multi-agent visual surveillance of dynamic scenes,” Image and Vision Computing, vol. 16, no. 8, pp. 529-532, 1998.
[34] Z. Q. Liu, L. T. Bruton, J. C. Bezdek, J. M. Keller, S. Dance, N. R. Bartley, and C. Zhang, “Dynamic image sequence analysis using fuzzy measures,” IEEE Transactions on Systems, Man and Cybernetics, Part B, vol. 31, no. 4, pp. 557 – 572, Aug. 2001.
[35] H. Ikeda, Y. Kaneko, T. Matsuo, and K. Tsuji, “Abnormal incident detection system employing image processing technology,” in Proc. of 1999 IEEE/IEEJ/JSAI International Conference on Intelligent Transportation Systems, 1999, pp. 748 – 752.
[36] M. M. Trivedi, I. Mikic, and G. Kogut, “Distributed video networks for incident detection and management,” in Proc. of 2000 IEEE on Intelligent Transportation Systems, 2000, pp. 155 – 160.
[37] S. Kamijo, Y. Matsushita, K. Ikeuchi, and M. Sakauchi, “Traffic monitoring and accident detection at intersections,” IEEE Transactions on Intelligent Transportation Systems, vol. 1, no. 2, pp. 108 – 118, June 2000.
[38] C. P. Lin, J. C. Tai, and K. T. Song, “Traffic monitoring based on real-time image tracking,” in Proc. of the IEEE International Conference on Robotics and Automation, vol. 2, 2003, pp. 2091 – 2096.
[39] W. Hu, X. Xiao, D. Xie, T. Tan, and S. Maybank, “Traffic accident prediction using 3-D model-based vehicle tracking,” IEEE Transactions on Vehicular Technology, vol. 53, no. 3, pp.