Occlusion handler - Human tracking - Human Tracking Using a Weighted Resampling Particle Filter on an Active Camera

Chapter 3 Human tracking

3.6 Occlusion handler

Normally, a particle filter can handle some occlusion conditions, but this depends on how widely the samples are spread in the image space. If some samples cover the location where the human reappears after the occlusion, the human can still be tracked continuously. On the other hand, if an occlusion occurs and the spread of the samples is too small to cover the region where the human reappears, resampling will lead to tracking loss.

The occlusion handler in our work is based on the color similarity between the target model and the candidate model. The details are described below.

1. Create the candidate model 𝑐 = {𝑐^(𝑢)}_{𝑢=1…𝑚} from the ROI in the current frame.

2. Compute the similarity value between the target model 𝑞^′ = {𝑞^′(𝑢)}_{𝑢=1…𝑚} and the candidate model 𝑐 = {𝑐^(𝑢)}_{𝑢=1…𝑚}.

3. If 𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 < 𝑡ℎ𝑠_𝑠𝑖𝑚, assume that the candidate model is occluded by another object and skip the resample step.

4. Increment the counter: 𝐶𝑜𝑢𝑛𝑡 = 𝐶𝑜𝑢𝑛𝑡 + 1.

5. During tracking, steps 1 to 4 are repeated until the tracked human reappears (the similarity value is larger than 𝑡ℎ𝑠_𝑠𝑖𝑚) or 𝐶𝑜𝑢𝑛𝑡 ≥ 10. The limit on 𝐶𝑜𝑢𝑛𝑡 keeps the samples from spreading out of the image.

6. The resample step is then restarted.
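The decision logic of steps 3 to 6 can be sketched as follows. This is a minimal illustration in Python, not the thesis implementation: the class name `OcclusionHandler` and the method `should_resample` are hypothetical, the similarity value is assumed to be computed elsewhere, and the threshold 0.7 is taken from the flow chart in Fig. 3-8.

```python
from dataclasses import dataclass

THS_SIM = 0.7    # similarity threshold (value shown in the Fig. 3-8 flow chart)
MAX_COUNT = 10   # limit on consecutive skipped resample steps (step 5)


@dataclass
class OcclusionHandler:
    """Hypothetical helper that decides whether to run the resample step."""
    count: int = 0  # consecutive frames in which resampling was skipped

    def should_resample(self, similarity: float) -> bool:
        if similarity < THS_SIM and self.count < MAX_COUNT:
            # Target assumed occluded: keep the current spread of samples.
            self.count += 1
            return False
        # Target visible again, or the counter reached its limit:
        # restart resampling and reset the counter.
        self.count = 0
        return True
```

In a tracking loop, `should_resample` would be called once per frame with the similarity between the target and candidate models, and the weighted resampling would run only when it returns `True`.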

Fig. 3-7 Occlusion handler (a) frame 681 (b) frame 685 (c) frame 690 (d) frame 695

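[Flow chart: the particle filter (PF) computes the similarity. If similarity < 0.7 and Count < 10, the occlusion handler increments Count (Count++) and resampling is skipped. Otherwise, weighted resampling is performed and Count is reset to 0.]

Fig. 3-8 Occlusion handler flow chart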

Fig. 4-1 Active camera control through RS485

The active camera is controlled with the Pelco P protocol [34] through an RS-232 to RS-485 converter. Tracking requires control of the pan angle (horizontal direction), the tilt angle (vertical direction), and the zoom step.

A Pelco P message is 8 bytes long, with the format shown in Fig. 4-2.

Byte 1 and byte 7 are the start and stop bytes, and are always set to 0xA0 and 0xAF respectively.

Byte 2 is the receiver (camera) address. In this thesis we use only one camera, so byte 2 is always set to 0x00. Bytes 3, 4, 5, and 6 control pan-tilt-zoom (PTZ) as shown in Fig. 4-3. The last byte is an XOR checksum byte.

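Byte 1 | Byte 2 | Byte 3 | Byte 4 | Byte 5 | Byte 6 | Byte 7 | Byte 8
Start (0xA0) | Address | Data byte 1 | Data byte 2 | Data byte 3 | Data byte 4 | Stop (0xAF) | Checksum

Fig. 4-2 Message format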

Bit | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
Data byte 1 | Fixed to 0 | Camera On | Auto Scan On | Camera On/Off | Iris Close | Iris Open | Focus Near | Focus Far
Data byte 2 | Fixed to 0 | Zoom Wide | Zoom Tele | Tilt Down | Tilt Up | Pan Left | Pan Right | 0 (for pan/tilt)
Data byte 3 | Pan speed 00 (stop) to 3F (high speed) and 40 for Turbo
Data byte 4 | Tilt speed 00 (stop) to 3F (high speed)

Fig. 4-3 Data byte 1 to 4 format
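As an illustration of the message format, the following Python sketch assembles one 8-byte message. The function name `build_pelco_p` and the constant names are hypothetical. The text above does not say which bytes enter the checksum; the sketch assumes the XOR is taken over bytes 1 to 7, which is our understanding of the Pelco P convention and should be checked against the camera documentation.

```python
STX = 0xA0  # start byte (byte 1)
ETX = 0xAF  # stop byte (byte 7)

# Bit masks for data byte 2, following the bit layout in Fig. 4-3.
PAN_RIGHT = 0x02
PAN_LEFT = 0x04
TILT_UP = 0x08
TILT_DOWN = 0x10
ZOOM_TELE = 0x20
ZOOM_WIDE = 0x40


def build_pelco_p(address: int, data1: int, data2: int,
                  pan_speed: int, tilt_speed: int) -> bytes:
    """Assemble one 8-byte Pelco P message (illustrative sketch)."""
    if not 0 <= pan_speed <= 0x40:
        raise ValueError("pan speed must be 0x00..0x3F, or 0x40 for turbo")
    if not 0 <= tilt_speed <= 0x3F:
        raise ValueError("tilt speed must be 0x00..0x3F")
    # Bit 7 of data bytes 1 and 2 is fixed to 0, hence the 0x7F mask.
    body = [STX, address & 0xFF, data1 & 0x7F, data2 & 0x7F,
            pan_speed, tilt_speed, ETX]
    checksum = 0
    for b in body:
        checksum ^= b  # assumption: XOR over bytes 1 to 7
    return bytes(body + [checksum])
```

For example, `build_pelco_p(0x00, 0x00, PAN_LEFT, 0x20, 0x00)` builds a pan-left message for the camera at address 0x00 with pan speed 0x20. The resulting bytes could then be written to the serial port that feeds the RS-232 to RS-485 converter.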

In this thesis, we divide the image into 9 regions, each associated with a pan-tilt direction, so that the moving object is kept in the center of the FOV. The direction for each region is shown in Fig. 4-4. If the target is located in the stop region, the camera is stopped. In the other regions, the camera speed is determined by a PID controller. Zoom-in or zoom-out is activated when the target's size becomes smaller or larger than a user-defined size. The details of camera control are shown in Fig. 4-5.

Fig. 4-4 Control direction for each region (the center region is the stop region)
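The mapping from target position to control direction might be sketched as below. Because the figure itself is not reproduced here, two points are assumptions: the 9 regions are taken to be an equal 3 x 3 grid, and the camera is taken to move toward the region that contains the target. The function name `region_command` is hypothetical.

```python
from typing import Tuple


def region_command(x: float, y: float, width: int, height: int) -> Tuple[int, int]:
    """Map a target position to (pan, tilt) directions on a 3x3 grid.

    pan:  -1 = left, 0 = none, +1 = right
    tilt: -1 = up,   0 = none, +1 = down (image y grows downward)
    (0, 0) corresponds to the stop region in the center.
    """
    col = min(int(3 * x / width), 2)   # 0, 1, 2 from left to right
    row = min(int(3 * y / height), 2)  # 0, 1, 2 from top to bottom
    return col - 1, row - 1
```

A result of `(0, 0)` would trigger the stop command, and any other result selects the pan and tilt direction bits, with the speed supplied by the PID controller.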

[Flow chart: starting from the tracking target, if target size > upper bound, a zoom out command is issued; if target size < lower bound, a zoom in command is issued; the upper bound, lower bound, and target size are then updated. If the target position is in the center region, a stop command is issued; otherwise PID control produces the pan / tilt command. Finally the PTZ command is sent.]

Fig. 4-5 Camera control flow chart

4.1 PID controller

A proportional-integral-derivative (PID) controller is a generic control-loop feedback mechanism widely used in industrial control systems [35].

The diagram of the PID controller is shown in Fig. 4-6.

Fig. 4-6 The PID controller

In a PID control system, the monitored plant/process should be kept in an ideal state. The measured value of the plant/process, 𝑦(𝑡), is sent to the comparator and compared with the setting value 𝑢(𝑡). If the plant/process is affected by a disturbance, the measured value differs from the setting value and the comparator produces an error signal 𝑒(𝑡). The error signal is sent to the controller, which produces an output signal 𝐶_𝑜𝑢𝑡 that corrects the plant/process and returns it to the ideal state.

The output signal 𝐶_𝑜𝑢𝑡 is defined by the following equation.

𝐶_𝑜𝑢𝑡 = 𝐾_𝑝 𝑒(𝑡) + 𝐾_𝐼 ∫ 𝑒(𝑡) 𝑑𝑡 + 𝐾_𝐷 𝑑𝑒(𝑡)/𝑑𝑡 (4.1)

where 𝐾_𝑝 is the proportional constant, 𝐾_𝐼 is the integral constant, and 𝐾_𝐷 is the derivative constant.

The controller consists of a proportional controller (P controller), an integral controller (I controller), and a derivative controller (D controller).

1. P controller: the error signal 𝑒(𝑡) multiplied by 𝐾_𝑝. This term corrects a disturbed plant/process, but it leaves a small steady-state error that it cannot remove.

2. I controller: the integral of the error signal 𝑒(𝑡) over time, so that an error contributes in proportion to how long it persists. The accumulated integral removes the small steady-state error that the P controller cannot overcome and brings the disturbed plant/process back to the setting value 𝑢(𝑡).

3. D controller: the derivative of the error signal 𝑒(𝑡). This term responds to the rate of change of the error, which lets the system anticipate large variations in the plant/process.

The corresponding variables of PID controller in our work are defined as follows:

Setting value 𝑢(𝑡): the center position of the image.

Error signal 𝑒(𝑡): the difference between the center position and the target position.

Measured value 𝑦(𝑡): the target position estimated by the tracking system.

Output signal 𝐶_𝑜𝑢𝑡: the output, which is converted to the pan / tilt speed.
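A discrete-time form of Eq. 4.1 can be sketched as follows. This is a textbook discretization under stated assumptions, not necessarily the one used in our implementation: the integral is approximated by a running sum, the derivative by a backward difference, and the sample period `dt` is assumed to be one frame. The class name `PID` and its method `update` are hypothetical.

```python
class PID:
    """Discrete-time approximation of Eq. 4.1 with a fixed sample period dt."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float = 1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0    # running sum approximating the integral of e(t)
        self.prev_error = 0.0  # previous error, used for the backward difference

    def update(self, setpoint: float, measured: float) -> float:
        """Return C_out for one sample; setpoint is u(t), measured is y(t)."""
        error = setpoint - measured                      # e(t)
        self.integral += error * self.dt                 # I term state
        derivative = (error - self.prev_error) / self.dt  # D term
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

With the variables defined above, `setpoint` would be the image center coordinate and `measured` the estimated target coordinate along the same axis, so one `PID` instance would be needed per axis.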

We use two independent PID controllers, one for the horizontal and one for the vertical position difference, to estimate the pan and tilt speeds. The 𝐶_𝑜𝑢𝑡 of each controller is converted to the pan speed or the tilt speed by Eq. 4.2 and Eq. 4.3.

𝑆𝑝𝑒𝑒𝑑_𝑝𝑎𝑛 = 𝐶_𝑜𝑢𝑡 ∗ 0.1 + 𝑜𝑓𝑓𝑠𝑒𝑡_𝑝𝑎𝑛 (4.2)

𝑆𝑝𝑒𝑒𝑑_{𝑡𝑖𝑙𝑡} = 𝐶_𝑜𝑢𝑡 ∗ 0.1 + 𝑜𝑓𝑓𝑠𝑒𝑡_{𝑡𝑖𝑙𝑡} (4.3)

𝑜𝑓𝑓𝑠𝑒𝑡_𝑝𝑎𝑛 = { 𝑜𝑓𝑓𝑠𝑒𝑡_𝑝𝑎𝑛 if 𝐶_𝑜𝑢𝑡 ≥ 0; −𝑜𝑓𝑓𝑠𝑒𝑡_𝑝𝑎𝑛 if 𝐶_𝑜𝑢𝑡 < 0 } (4.4)

𝑜𝑓𝑓𝑠𝑒𝑡_{𝑡𝑖𝑙𝑡} = { 𝑜𝑓𝑓𝑠𝑒𝑡_{𝑡𝑖𝑙𝑡} if 𝐶_𝑜𝑢𝑡 ≥ 0; −𝑜𝑓𝑓𝑠𝑒𝑡_{𝑡𝑖𝑙𝑡} if 𝐶_𝑜𝑢𝑡 < 0 } (4.5)

where 𝑜𝑓𝑓𝑠𝑒𝑡_𝑝𝑎𝑛 and 𝑜𝑓𝑓𝑠𝑒𝑡_{𝑡𝑖𝑙𝑡} are defined by the user. These values are related to the pan-tilt speed range given in the camera's specifications (0 to 64). Eq. 4.2 and Eq. 4.3 limit the speed produced from the PID controller output to a suitable range: if the speed is set too large, the camera may overshoot the object, and tracking may be lost as a consequence.
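Eq. 4.2 to Eq. 4.5 can be sketched as a single helper. The function name `to_speed` is hypothetical. Splitting the result into a magnitude and a sign, rounding to an integer, and clamping to 0x3F are assumptions added here so that the value fits the speed byte of Fig. 4-3; which sign corresponds to which pan or tilt direction is not specified in the text.

```python
from typing import Tuple


def to_speed(c_out: float, offset: float, max_speed: int = 0x3F) -> Tuple[int, int]:
    """Apply Eq. 4.2/4.3 with the signed offset of Eq. 4.4/4.5.

    Returns (magnitude, sign): magnitude is a non-negative integer for the
    speed byte, sign is +1 or -1 and would select the direction bit.
    """
    signed_offset = offset if c_out >= 0 else -offset  # Eq. 4.4 / 4.5
    speed = c_out * 0.1 + signed_offset                # Eq. 4.2 / 4.3
    sign = 1 if speed >= 0 else -1
    # Assumption: round and clamp so the value fits the speed byte.
    magnitude = min(int(round(abs(speed))), max_speed)
    return magnitude, sign
```

Because the offset has the same sign as 𝐶_𝑜𝑢𝑡, a small controller output still yields a speed of at least the offset in magnitude, while the factor 0.1 keeps large outputs from producing excessive speeds.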

new width and height are defined as follows.

𝑟𝑎𝑡𝑖𝑜_{𝑤/ℎ} = 𝑤_{𝑖𝑛𝑖𝑡𝑖𝑎𝑙} / ℎ_{𝑖𝑛𝑖𝑡𝑖𝑎𝑙}

Chapter 5 Experimental results

This system was implemented on a PC with an Intel® Core™ i5 CPU 650 @ 3.20GHz and 4GB RAM, and developed in Borland C++ Builder 6.0 on Windows 7. The system was tested in several environments to verify its performance and stability. Both video files (uncompressed AVI format) and image sequences from the active camera were tested.

5.1 Track on video file

Three videos were used to verify the tracking system, with the particle filter parameters set as follows.

Number of samples 𝑁 = 30

Number of bins in histogram 𝑚 = 6 ∗ 6 ∗ 6 = 216

State covariance (𝜎_𝑥, 𝜎_{𝑣_𝑥}, 𝜎_𝑦, 𝜎_{𝑣_𝑦}, 𝜎_𝑤, 𝜎_ℎ) = (2, 0.5, 2, 0.5, 0.4, 0.8)
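To make the bin count concrete, the sketch below maps a three-channel pixel to one of the 𝑚 = 216 bins and builds a normalized histogram. It assumes 8-bit channels and uniform quantization into 6 levels per channel, and it omits any spatial weighting of the pixels; the function names `bin_index` and `histogram` are hypothetical, and the color space is left unspecified.

```python
BINS_PER_CHANNEL = 6
M = BINS_PER_CHANNEL ** 3  # 6 * 6 * 6 = 216 bins


def bin_index(c0: int, c1: int, c2: int) -> int:
    """Map three 8-bit channel values to a histogram bin index in 0..215."""
    q0 = c0 * BINS_PER_CHANNEL // 256  # quantize each channel to 0..5
    q1 = c1 * BINS_PER_CHANNEL // 256
    q2 = c2 * BINS_PER_CHANNEL // 256
    return (q0 * BINS_PER_CHANNEL + q1) * BINS_PER_CHANNEL + q2


def histogram(pixels):
    """Normalized M-bin histogram of an iterable of (c0, c1, c2) pixels."""
    hist = [0.0] * M
    n = 0
    for c0, c1, c2 in pixels:
        hist[bin_index(c0, c1, c2)] += 1.0
        n += 1
    # Normalize so the bins sum to 1; an empty input yields all zeros.
    return [h / n for h in hist] if n else hist
```

A coarser histogram such as this one reduces the cost of each similarity computation, which matters because the similarity is evaluated for every sample in every frame.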

1. Video 1 is used to verify the occlusion handler in our system, as shown in Fig. 5-1 and Fig. 5-2. Figure 5-1 shows the tracking result without the occlusion handler. Full occlusion occurs in frame 685. If the particle filter resamples during the full occlusion, it may resample at incorrect positions, as shown in frame 689, and tracking is lost in frames 694 and 698. In contrast, when full occlusion occurs in the particle filter with the occlusion handler, the resample step is not performed immediately, so the sample set keeps a wide spread and can track the target after the full occlusion.

Fig. 5-1 Tracking without occlusion handler (frames 558, 673, 685, 689, 694, and 698)

Fig. 5-2 Tracking with occlusion handler (frames 558, 673, 685, 689, 694, and 698)

2. Video 2 is used to verify the tracking feature. Figure 5-3 shows a human wearing a black jacket walking near a black chair. In this case, the target human has a color feature similar to that of the black chair, but the proposed system can still track the target human.

Fig. 5-3 Object has similar color as target human (frames 193, 258, 292, 351, 372, and 398)

3. Video 3 is used to verify the tracking performance in a complex situation. Figure 5-4 shows the target human partially occluded by a chair. The target human sits down and stands up, as shown in Fig. 5-4 (b). Moreover, the target human is partially occluded by another human, as shown in Fig. 5-4 (c).

Fig. 5-4 (a) frame 1436, 1547 and 1605 (b) frame 2152, 2214 and 2277 (c) frame 2757, 2818 and 2838

5.2 Track on active camera

The active camera is set up in our laboratory. The environment is complex enough to verify the system while it detects and tracks a moving human. The parameters of the particle filter and PTZ control are set as follows:

Number of samples 𝑁 = 30

Number of bins in histogram 𝑚 = 6 ∗ 6 ∗ 6 = 216

State covariance (𝜎_𝑥, 𝜎_{𝑣_𝑥}, 𝜎_𝑦, 𝜎_{𝑣_𝑦}, 𝜎_𝑤, 𝜎_ℎ) = (10, 1, 10, 1, 1, 2)

𝑜𝑓𝑓𝑠𝑒𝑡_𝑝𝑎𝑛 = 12

𝑜𝑓𝑓𝑠𝑒𝑡_{𝑡𝑖𝑙𝑡} = 6

Proportional constant 𝐾_𝑝 = 0.9

Integral constant 𝐾_𝐼 = 0.1

Derivative constant 𝐾_𝐷 = 0.15

𝑟𝑎𝑡𝑒_𝑏𝑖𝑔 = 1.1

𝑟𝑎𝑡𝑒_{𝑠𝑚𝑎𝑙𝑙} = 0.9

1. Tracking by controlling pan and tilt only. The experimental result shows that the target human mostly stays within the camera's FOV, no matter how he walks.

Fig. 5-5 Update pan and tilt command to track

2. Tracking to test zoom-in/out. In this case, the target human walks away from or approaches the camera.

Fig. 5-6 The effects of zoom in / out (panels (a) to (i))

Figure 5-6 (a) shows that the target human has been detected and 𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} is initialized to 0. Each zoom-in increases 𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} by 1, and each zoom-out decreases it by 1. The values of 𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} are shown in Table 5-1.

Table 5-1 Zoom layer varies in Fig. 5-6

Panel | (a) | (b) | (c) | (d) | (e) | (f) | (g) | (h) | (i)
𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} | 0 | 0 | 1 | 2 | 1 | 2 | 3 | 2 | 1
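The bookkeeping of 𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} can be sketched together with the zoom decision of Fig. 4-5. The way the bounds are derived here is an assumption: the sketch takes the upper and lower bounds to be a reference target size multiplied by 𝑟𝑎𝑡𝑒_𝑏𝑖𝑔 and 𝑟𝑎𝑡𝑒_{𝑠𝑚𝑎𝑙𝑙}. The function name `zoom_step` and the parameter `ref_size` are hypothetical.

```python
from typing import Tuple


def zoom_step(target_size: float, ref_size: float, zoom_layer: int,
              rate_big: float = 1.1, rate_small: float = 0.9) -> Tuple[str, int]:
    """Return (command, new zoom_layer) for one frame."""
    upper = ref_size * rate_big    # assumed definition of the upper bound
    lower = ref_size * rate_small  # assumed definition of the lower bound
    if target_size > upper:
        return "zoom_out", zoom_layer - 1  # target too large: zoom out
    if target_size < lower:
        return "zoom_in", zoom_layer + 1   # target too small: zoom in
    return "none", zoom_layer
```

Under this sketch, each change of 𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} between consecutive panels in Table 5-1 corresponds to a single zoom-in or zoom-out command.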

3. Tracking by controlling pan, tilt, and zoom, with the target human walking freely in the environment.

Fig. 5-7 Combination of pan, tilt and zoom in / out (panels (a) to (l))

Table 5-2 Zoom layer varies in Fig. 5-7

Panel | (a) | (b) | (c) | (d) | (e) | (f) | (g) | (h) | (i) | (j) | (k) | (l)
𝑍𝑜𝑜𝑚_{𝑙𝑎𝑦𝑒𝑟} | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 2 | 1 | 0 | 0

4. Tracking a target human while more than one person is walking in the same environment.

Fig. 5-8 Human tracking in multiple objects (panels (a) to (l))

5. Tracking a target human while more than one person is walking in the same environment.

Fig. 5-9 Human tracking in multiple objects (panels (a) to (l))

Chapter 6 Conclusions and Future work

6.1 Conclusions

The experimental results show that the proposed system can track a moving human with the particle filter algorithm on an active camera. The tracking system can also track the target human when more than one person is walking in the same environment.

Moreover, zoom-in/out adjusts the image resolution of the tracked human.

There are several contributions in this research:

1. Our system can distinguish humans from non-human objects.

2. The weighted resampling helps the particle filter preserve the samples with high weights.

3. The occlusion handler can handle temporary full occlusion.

4. The system tracks the target human smoothly by using the PID controller to determine the camera motion.

6.2 Future works

In our system, a moving human can be detected and tracked smoothly and continuously. However, some situations still lead to tracking loss. For example, a significant lighting change in the background alters the appearance of the moving human.

To run the particle filter with the active camera in real time, we reduce the number of color histogram bins and the number of samples, which sometimes affects the tracking accuracy. This can be addressed by optimizing the samples. For example, mean-shift can be used to optimize each sample in the particle filter.

The active camera is driven by the Pelco P protocol and uses PID controllers to pan and tilt. Driving the active camera in this way is successful, but it does not use 𝑣_𝑥 and 𝑣_𝑦 from the estimated target state vector 𝑠_{𝑡𝑎𝑟𝑔𝑒𝑡}. These velocities could be incorporated into the pan and tilt speeds to increase the accuracy of camera control.

References

[1] W. J. Gillner, “Motion based vehicle detection on motorways”, Proceedings of the IEEE Intelligent Vehicles '95 Symposium, pp. 483-487, September 1995.

[2] P. H. Batavia, D. A. Pomerleau, and C. E. Thorpe, “Overtaking vehicle detection using implicit optical flow”, Proceedings of the IEEE Transportation Systems Conference, pp. 729-734, November 1997.

[3] L. Zhao and C. E. Thorpe, “Stereo- and neural network-based pedestrian detection”, IEEE Transactions on Intelligent Transportation Systems, vol. 1, pp. 148-154, September 2000.

[4] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886-893, 2005.

[5] S. Montabone, A. Soto, “Human detection using a mobile platform and novel features derived from a visual saliency mechanism”, Image and Vision Computing, vol. 28, pp. 391-402, 2010.

[6] P. Viola, M. J. Jones, and D. Snow, “Detecting Pedestrians Using Patterns of Motion and Appearance”, International Journal of Computer Vision, vol. 63, pp. 153-161, 2005.

[7] M. Dimitrijevic, V. Lepetit, and P. Fua, “Human body pose detection using Bayesian spatio-temporal templates”, Computer Vision and Image Understanding, vol. 104, pp. 127-139, 2006.

[8] R. C. Gonzalez, R. E. Woods, Digital Image Processing, Addison-Wesley, New York, 1992.

[9] D. Marr and E. Hildreth, “Theory of edge detection”, Proceedings of the Royal Society, vol. 207, pp. 197–217, London, 1980.

[10] W. Guo, D. L. Bi, L. Liu, “Human motion tracking based on shape analysis”, Proceedings of the International Conference on Wavelet Analysis and Pattern Recognition, pp. 2-4, Beijing, China, November 2007.

[11] T. Law, H. Itoh, and H. Seki, “Image filtering, edge detection, and edge tracing using fuzzy reasoning”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, pp. 481-491, May 1996.

[12] O. Williams, A. Blake, and R. Cipolla, “Sparse bayesian learning for efficient visual tracking”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, pp. 1292–1304, August 2005.

[13] A. Agarwal and B. Triggs, “Recovering 3D human pose from monocular images”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, pp. 44-58, January 2006.

[14] B. Han and L. Davis, “Object tracking by adaptive feature extraction”, International Conference on Image Processing, pp. 1501-1504, October 2004.

[15] R. T. Collins, Y. Liu and M. Leordeanu, “Online selection of discriminative tracking features”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, pp. 1631-1643, October 2005.

[16] K. Fukunaga and L. D. Hostetler, “The estimation of the gradient of a density function, with applications in pattern recognition”, IEEE Transactions on Information Theory, vol. 21, pp. 32-40, January 1975.

[17] D. Comaniciu, V. Ramesh, and P. Meer, “Kernel-based object tracking”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, pp. 564-577, May 2003.

[18] D. Freedman and P. Kisilev, “Fast mean shift by compact density representation”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 1818-1825, June 2009.

[19] F. L. Wang, S. Y. Yu, and J. Yang, “Robust and efficient fragments-based tracking using mean shift”, AEU - International Journal of Electronics and Communications, vol. 64, pp. 614-623, July 2010.

[20] F. Porikli and O. Tuzel, “Multi-kernel object tracking”, IEEE International Conference on Multimedia and Expo, pp. 1234–1237, July 2005.

[21] C. Yang, R. Duraiswami, and L. Davis, “Efficient mean-shift tracking via a new similarity measure”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 176–183, June 2005.

[22] R. V. Babu, P. Pérez, and P. Bouthemy, “Robust tracking with motion estimation and local Kernel-based color modeling”, Image and Vision Computing, vol. 25, pp. 1205–1216, August 2007.

[23] S. Feng, Q. Guan, S. Xu and F. Tan, “Human tracking based on mean shift and Kalman Filter”, International Conference on Artificial Intelligence and Computational Intelligence, 2009.

[24] P. Pérez, C. Hue, J. Vermaak, M. Gangnet, “Color-based probabilistic tracking”, Proceedings of European Conference on Computer Vision, pp. 661-675, 2002.

[25] K. Nummiaro, E. Koller-Meier, and L. V. Gool, “An adaptive color based particle filter”, Image and Vision Computing, vol. 21, pp. 99–110, 2003.

[26] D. Murray and A. Basu, “Motion tracking with an active camera”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, May 1994.

[27] C. W. Lin, C. M. Wang, Y. J. Chang, and Y. C. Chen, “Real-time object extraction and tracking with an active camera using image mosaics”, Proceedings of the IEEE Workshop on Multimedia Signal Processing, pp. 149-152, December 2002.

[28] R. T. Collins, O. Amidi, and T. Kanade, “An active camera system for acquiring multi-view video”, Proceedings of the International Conference on Image Processing, September 2002.

[29] L. Fiore, D. Fehr, R. Bodor, A. Drenner, G. Somasundaram and N. Papanikolopoulos, “Multi-camera human activity monitoring”, Journal of Intelligent Robotic Systems, vol. 52, pp. 5-43, May 2008.

[30] A. R. Smith, “Color Gamut Transform Pairs”, SIGGRAPH 78 Conference Proceedings, vol. 12, pp. 12-19, August 1978.

[31] http://en.wikipedia.org/wiki/HSV_color_space#Conversion_from_RGB_to_HSL_or_HSV

[32] http://www.mathworks.com/access/helpdesk/help/toolbox/images/f8-20792.html

[33] Y. Cheng, “Mean shift, mode seeking, and Clustering”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 790-799, Aug. 1995.

[34] http://www.commfront.com/RS232_Examples/CCTV/Pelco_D_Pelco_P_Examples_Tutorial2.HTM#1

[35] http://en.wikipedia.org/wiki/PID_controller
