CHAPTER 6. CONCLUSIONS AND FUTURE WORKS
There are two main directions for future improvement. The first is how to estimate the torso motion robustly; the second is how to prevent body parts from interfering with each other when their likelihood functions are evaluated for candidate states. We discuss possible improvements for each direction in turn.
For torso motion tracking, we have provided a torso prediction mechanism to increase reliability. However, poor observations, such as noise in the silhouettes or voxels, may still make the estimation unstable. Possible improvements for torso tracking are listed below:
• Build an online appearance model for a more reliable likelihood function that considers not only shape information but also appearance information. This is especially useful when the target subject wears clothes with conspicuous features.
• Utilize the head position and orientation. Face detection is robust enough that the detected face position and orientation can be used for torso prediction.
• Adopt one of the many advanced particle filtering algorithms (referred to in Section 2.3.2) to make tracking more effective and reliable.
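As an illustration of the first item, the following is a minimal sketch of how a particle weight could combine a shape term with an appearance term, and how an appearance model could be updated online. It is not the implementation used in this work: the color-histogram representation, the Bhattacharyya distance, the Gaussian form of both terms, and all names and parameter values (`sigma_shape`, `sigma_app`, `rate`) are assumptions made for the example.

```python
import math

def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]

def bhattacharyya_distance(p, q):
    """Distance in [0, 1] between two normalized histograms of equal length."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    bc = min(1.0, bc)  # guard against rounding slightly above 1
    return math.sqrt(1.0 - bc)

def combined_likelihood(shape_error, observed_hist, model_hist,
                        sigma_shape=1.0, sigma_app=0.2):
    """Unnormalized particle weight: shape term times appearance term.

    shape_error is a hypothetical scalar mismatch between the predicted
    body part and the silhouette/voxel data (0 means a perfect fit).
    """
    d = bhattacharyya_distance(normalize(observed_hist), normalize(model_hist))
    shape_term = math.exp(-shape_error ** 2 / (2.0 * sigma_shape ** 2))
    appearance_term = math.exp(-d ** 2 / (2.0 * sigma_app ** 2))
    return shape_term * appearance_term

def update_model(model_hist, observed_hist, rate=0.1):
    """Online update: blend the current model with the latest observation."""
    m = normalize(model_hist)
    o = normalize(observed_hist)
    return [(1.0 - rate) * a + rate * b for a, b in zip(m, o)]

# Usage: a particle whose observed colors match the model is weighted higher.
model = [1, 1, 2]
w_match = combined_likelihood(0.5, [2, 2, 4], model)
w_mismatch = combined_likelihood(0.5, [4, 2, 1], model)
model = update_model(model, [2, 2, 4])
```

With the same shape error, the particle whose observed histogram matches the model receives the larger weight, which is the intended effect when the subject wears clothes with conspicuous features.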
For limb motion tracking, we proposed 1-DOF particle filtering combined with soft-joint-constrained ICP. Its performance is determined mainly by the correspondence matching stage of ICP. Two possible improvements for limb tracking are:
• The voxel labeling method based on the previous pose is fast but primitive. Appearance and motion information could be used to improve voxel labeling accuracy.
• In addition to the soft-joint constraint, we can also regularize ICP with other human anthropometric constraints to avoid rare or impossible human poses.
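To make the first item concrete, the sketch below shows one simple reading of labeling voxels from the previous pose: each voxel is assigned to the nearest body segment of the previous frame's skeleton. This is an illustrative assumption and not necessarily the labeling rule used in this work; the segment names, the line-segment body model, and the `max_dist` rejection threshold are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from 3D point p to the line segment from a to b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project p onto the segment and clamp to its endpoints.
        t = max(0.0, min(1.0, sum(ap[i] * ab[i] for i in range(3)) / denom))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)

def label_voxels(voxels, segments, max_dist=None):
    """Label each voxel with the nearest segment name from the previous pose.

    segments maps a (hypothetical) body-part name to its two 3D endpoints.
    Voxels farther than max_dist from every segment are labeled None.
    """
    labels = []
    for v in voxels:
        name, dist = min(
            ((n, point_segment_distance(v, a, b)) for n, (a, b) in segments.items()),
            key=lambda item: item[1],
        )
        labels.append(name if max_dist is None or dist <= max_dist else None)
    return labels

# Usage with a two-segment arm taken from the previous frame's pose.
previous_pose = {
    "upper_arm": ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    "forearm": ((1.0, 0.0, 0.0), (1.0, 1.0, 0.0)),
}
voxels = [(0.2, 0.1, 0.0), (1.1, 0.8, 0.0), (5.0, 5.0, 5.0)]
labels = label_voxels(voxels, previous_pose, max_dist=0.5)
```

A purely geometric rule like this is fast, but it mislabels voxels when limbs move quickly or come close to each other. Appearance or motion cues could enter as additional terms in the per-voxel cost.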