Chapter 3. Mean-Shift Tracking and Multi-Blob Model

3.2 Multi-Blob Model Based Mean-Shift Tracking

3.2.8 Overall Object Tracking Process

To sum up, the flow chart of the proposed object tracking procedure is shown in Figure 3.20. Compared with the flow chart of the traditional mean-shift tracking process in Figure 3.3, we need not build the candidate during the mean-shift iterations. The location convergence condition is $\|y_1 - y_0\|^2 < \varepsilon$ for a small threshold $\varepsilon$, and the orientation convergence condition is $\left| \det(V_{new}) / \det(V) - 1 \right| < 0.05$, which means that the number of samples varies by less than 5%.
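The two convergence conditions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation (which was written in MATLAB): the function names, the 2x2 nested-list representation of the covariance matrix, and the default threshold `eps=1.0` are assumptions, since the text does not state the value of the location threshold.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested sequences."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def location_converged(y0, y1, eps=1.0):
    """True when the squared displacement ||y1 - y0||^2 is below eps.

    eps is a placeholder value; the text does not give the threshold.
    """
    dx = y1[0] - y0[0]
    dy = y1[1] - y0[1]
    return dx * dx + dy * dy < eps


def orientation_converged(v_old, v_new, tol=0.05):
    """True when det(V_new) / det(V) deviates from 1 by less than tol."""
    d_old = det2(v_old)
    if d_old <= 0:
        # No valid ellipse to compare against, so report "not converged".
        return False
    return abs(det2(v_new) / d_old - 1.0) < tol
```

In a tracking loop, the iterations for one frame would stop once both functions return True.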

Figure 3.20 Flow chart of the proposed object tracking process.

Furthermore, we can make some simple predictions between frames or between mean-shift iterations to shorten the processing time. In addition, a target-loss test is added as an extra stage to increase robustness. We use two simple rules to judge whether tracking has failed. The first rule checks the covariance matrix V obtained by Eq. 3-40. If $\det(V) \le 0$, we are not able to define the bounding ellipse: recalling Eq. 3-17, a negative area is not reasonable, and in practice a hyperbola is obtained instead of the bounding ellipse. The second rule checks the reliability. If all blobs have low reliability, the tracking result is likely to be a failure.
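The two target-loss rules can be expressed as a single predicate. The sketch below is a hedged illustration in Python: the function name `target_lost`, the list-based inputs, and the cutoff `low=0.1` for what counts as "low reliability" are assumptions, because the text does not quantify that threshold.

```python
def target_lost(v, reliabilities, low=0.1):
    """Return True if either target-loss rule fires.

    v             -- 2x2 covariance matrix as nested sequences
    reliabilities -- one reliability value per blob
    low           -- hypothetical cutoff below which a blob is unreliable
    """
    # Rule 1: a non-positive determinant means no bounding ellipse exists.
    det_v = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    if det_v <= 0:
        return True
    # Rule 2: every blob has low reliability.
    # Note: all() is True for an empty list, so a model with no blobs
    # is also treated as lost.
    return all(r < low for r in reliabilities)
```

When the predicate returns True, the tracker would hand control back to motion detection to re-acquire the object.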

Chapter 4. Experimental Results

We present some object tracking results using the proposed algorithm in this chapter.

The resolution of all sequences is 640×480. The algorithm is implemented in MATLAB 6.5 and runs on a 3 GHz Pentium 4 PC with 512 MB of DRAM. The red ellipse and the blue bounding box represent the results of the proposed algorithm and the traditional mean-shift process, respectively.

In the first experiment, our mean-shift algorithm was run on the sequence “Hans”.

There is neither scene change nor occlusion in this sequence. The tracking result is shown below, together with the reliability map, which illustrates how the reliabilities change during tracking. The multi-blob model is built at Frame 35, and all reliabilities are initialized to 1.

In addition, we ran the traditional mean-shift procedure at the same time; it employs the “plus or minus 10 percent” scale adaptation method and uses a 16×16×16 histogram in the RGB space as the model.
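For readers unfamiliar with the “plus or minus 10 percent” method, the idea described in [1] is to evaluate the tracker at the previous window size and at sizes 10% smaller and larger, and to keep the size with the best similarity. The Python sketch below illustrates only that selection step. The callable `similarity_at` is a hypothetical stand-in for running mean-shift at a given size and returning the resulting similarity, and the smoothing factor `gamma` is an assumed parameter, not something specified in this chapter.

```python
def select_scale(h_prev, similarity_at, step=0.10, gamma=0.1):
    """Pick the window size among h_prev and h_prev -/+ step that
    maximizes similarity_at, then blend it with the previous size.

    similarity_at -- hypothetical callable: window size -> similarity score
    gamma         -- assumed smoothing weight given to the best size
    """
    candidates = [h_prev * (1.0 - step), h_prev, h_prev * (1.0 + step)]
    h_best = max(candidates, key=similarity_at)
    # Blend with the previous size to avoid abrupt scale changes.
    return gamma * h_best + (1.0 - gamma) * h_prev
```

With `gamma=1.0` the best candidate is adopted directly; smaller values make the window size change more slowly.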

[Figure 4.1 panels: Frames 35, 40, 50, 75, 120, 160, 195, 235, 295, 360, 420, 485, 515, 590, 630, 695, 765, 850]

Figure 4.1 Experimental results of the sequence “Hans”. The orientation of the bounding ellipse and the reliability of blobs can be updated during tracking.

We use the RGB color space and separate each channel into 8 bins; therefore, our target model contains 8×8×8 = 512 blobs. The color of the pants belongs to Blobs 147 and 148, and the color of the iron shelves belongs to Blob 148. Therefore, the reliability of Blob 148 should be high when the moving object is far away from the iron shelves but low when they are close. Figure 4.2 shows that the reliabilities of these two blobs are properly updated according to the background information.
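As a rough illustration of how a pixel color maps to one of the 512 color bins, the following Python sketch quantizes each 8-bit channel into equal-width bins and combines the three bin numbers into a single index. The function name and the index ordering (red most significant, zero-based) are assumptions, so the indices it produces need not match the blob numbering used in the thesis. It also covers only the color part of a blob, not its spatial information.

```python
def blob_index(rgb, bins=8):
    """Map an 8-bit (r, g, b) color to a bin index in [0, bins**3).

    The ordering r * bins^2 + g * bins + b is an assumed convention.
    """
    r, g, b = rgb
    # Equal-width quantization of the 0..255 range into `bins` bins.
    rq = r * bins // 256
    gq = g * bins // 256
    bq = b * bins // 256
    return (rq * bins + gq) * bins + bq
```

With `bins=8` the index range is 0 to 511, and with `bins=4` it is 0 to 63, matching the 512-blob and 64-blob models used in the experiments.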

Figure 4.2 The reliability of Blobs 147 and 148. They are updated according to the background information.

In the second experiment, a more complex clip is tested, in which both occlusion and scene change appear. Moreover, the color of the clothing is very close to that of some parts of the background. To test the robustness of our method, we separate each channel into only 4 bins; therefore, our target model contains 64 blobs. Figure 4.3 shows the tracking result. We mark the moving object with a green ellipse instead of a red one when the target model is updated.

[Figure 4.3 panels: Frames 35, 55, 65, 115, 140, 185, 270, 400, 470, 505, 570, 595, 630, 660, 710, 775, 830, 885]

Figure 4.3 The tracking result of the sequence Watson. The target model is updated at Frame 65.

Due to the distraction caused by the background, the target model must be updated at appropriate moments for tracking to succeed. We use the variance ratio to evaluate how appropriate a given moment is. Figure 4.4 shows the variance ratio of the tracking result.

Figure 4.4 The variance ratio of the tracking result.

Since all reliabilities are initialized to 1, the variance ratio at Frame 35 is 0.5. The variance ratio changes as the reliabilities are updated. At Frame 65, the variance ratio reaches its peak value of 0.6743, and the algorithm updates the target model. The variance ratio drops rapidly when the moving object is close to the door or the shelves, or when occlusion occurs.

In the third experiment, we ran the same code as in the second experiment on the sequence Wesar. This sequence contains zooming, and the moving object moves away from the camera; hence, the size of the moving object decreases through the sequence. Figure 4.5 shows the result, where the model was built at Frame 35.

[Figure 4.5 panels: Frames 35, 70, 105, 140, 210, 260, 300, 355]

Figure 4.5 The localization of the traditional mean-shift process is poor when the object’s size decreases.

Due to the severe scene change and the complex background, the size and orientation updates are not stable. The reliability update is not sensitive enough in this situation.

The definition of the target model affects not only the localization but also the number of iterations. Better localization produces a steeper similarity surface with more pronounced peaks; hence, fewer iterations are needed.

Figure 4.6 Number of iterations when executing the proposed mean-shift process (red) and the traditional mean-shift procedure (blue).

In the final experiment, we use the sequence Stan, which contains large differences in luminance. A complicated background and scene change also appear in the sequence. In Figure 4.7, the object moves from a shadowed region to a bright region at Frame 65. Such a severe change in luminance may cause tracking failure.

In prior research, other features have been used to increase robustness. For example, [13] uses 2D image gradients as features, [16] uses the HSV color space, and [17] uses so-called excess color features, such as 2G-R-B. However, selecting different features for different cases is cumbersome. In Section 3.2.8, we proposed two simple rules to judge whether tracking has failed.

Whenever the tracker loses the target, our algorithm restarts the tracking process automatically by detecting the moving object again. Figure 4.7 shows the experimental result, where the red ellipse indicates the target. The color is switched from red to green to mark the moments at which motion detection is restarted.

[Figure 4.7 panels: Frames 5, 35, 65, 75, 100, 165, 185, 210, 220, 250]

Figure 4.7 The experimental result, where the red ellipse indicates the target. Because the target was lost, the algorithm re-detected the object at Frames 75, 185, and 220. We switch the color from red to green at these frames.

Chapter 5. Conclusions

We have proposed a complete object tracking algorithm that includes motion detection, motion tracking, and target updating. We have presented a new model to describe the target, which contains information in both the spatial domain and the feature domain. Based on this multi-blob model, we define a similarity measurement and a mean-shift tracking procedure. After the location of the moving object has been tracked by mean-shift, a further process adjusts the size and orientation of the bounding ellipse. To improve the robustness of the system, we set a criterion for updating the target model and two rules to check whether tracking has failed.

The proposed object tracking system can deal with scene change and occlusion. An outdoor sequence with a severe change in luminance was also tested. Owing to the localization provided by the multi-blob model and the discriminative similarity measurement, the proposed algorithm converges faster than the traditional one. The size and orientation of the bounding ellipse are also updated during tracking.

References

[1] D. Comaniciu, V. Ramesh, P. Meer, “Kernel-based object tracking”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564–575, May 2003.

[2] W. E. L. Grimson, C. Stauffer, R. Romano, L. Lee, “Using adaptive tracking to classify and monitor activities in a site”, in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Santa Barbara, CA, 1998, pp. 22–31.

[3] A. Mittal, D. Huttenlocher, “Scene modeling for wide area surveillance and image synthesis”, in Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, 2000, pp. 160–167.

[4] T. Wada, T. Matsuyama, “Appearance sphere: Background model for pan-tilt-zoom camera”, in Proc. 13th Int. Conf. Pattern Recognition, Vienna, Austria, 1996, pp. A-718–A-722.

[5] A. Elgammal, R. Duraiswami, D. Harwood, L. S. Davis, “Background and foreground modeling using nonparametric kernel density estimation for visual surveillance”, Proc. IEEE, vol. 90, no. 7, pp. 1151–1163, July 2002.

[6] A. J. Lipton, H. Fujiyoshi, R. S. Patil, “Moving target classification and tracking from real-time video”, in Proc. IEEE Workshop Applications of Computer Vision, 1998, pp. 8–14.

[7] C. R. Wren, A. Azarbayejani, T. Darrell, A. P. Pentland, “Pfinder: real-time tracking of the human body”, IEEE Trans. Pattern Anal. Machine Intell., vol. 19, pp. 780–785, July 1997.

[8] S. McKenna, S. Jabri, Z. Duric, A. Rosenfeld, H. Wechsler, “Tracking groups of people”, Comput. Vis. Image Understanding, vol. 80, no. 1, pp. 42–56, 2000.

[9] H. Fujiyoshi, A. J. Lipton, “Real-time human motion analysis by image skeletonization”, in Proc. 4th IEEE Workshop on Applications of Computer Vision (WACV'98), Princeton, New Jersey, October 19–21, 1998, pp. 15–21.

[10] N. Peterfreund, “Robust tracking of position and velocity with Kalman snakes”, IEEE Trans. Pattern Anal. Machine Intell., vol. 22, pp. 564–569, June 2000.

[11] I. A. Karaulova, P. M. Hall, A. D. Marshall, “A hierarchical model of dynamics for tracking people with a single video camera”, in Proc. British Machine Vision Conf., 2000, pp. 262–352.

[12] D.-S. Jang, H.-I. Choi, “Active models for tracking moving objects”, Pattern Recognit., vol. 33, no. 7, pp. 1135–1146, 2000.

[13] C. Yang, R. Duraiswami, L. Davis, “Efficient mean-shift tracking via a new similarity measure”, in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR 2005), vol. 1, pp. 176–183, June 20–25, 2005. DOI 10.1109/CVPR.2005.139.

[14] S. T. Birchfield, S. Rangarajan, “Spatiograms versus histograms for region-based tracking”, in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR 2005), vol. 2, pp. 1158–1163, June 20–25, 2005.

[15] R. T. Collins, “Mean-shift blob tracking through scale space”, in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, vol. 2, pp. II-234–II-240, June 18–20, 2003.

[16] Z. Zivkovic, B. Krose, “An EM-like algorithm for color-histogram-based object tracking”, in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR 2004), vol. 1, pp. I-798–I-803, June 27–July 2, 2004.

[17] R. T. Collins, Y. Liu, M. Leordeanu, “Online selection of discriminative tracking features”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1631–1643, Oct. 2005.
