
Chapter 4 Experimental Results and Discussions

4.2 Recognition over a series of actions

In this experiment, a human performs a series of different actions, and the system automatically recognizes the action type in each frame. Three video clips, each containing a different action series, are used to test the proposed system. We compare the recognition results with manually annotated ground truth to evaluate system performance.

The first test sequence is “Sit up – get up – Jump 2 – turn about – Walk – turn about – Crawl 1”. The second is “Sidewalk – turn about – Walk – turn about – Pick up”. The third is “Crawl 2 – get up – turn about – Walk – turn about – Jump 2”. Each sequence contains about 3–4 defined action types and 1–2 undefined (transitional) actions. Figures 4-4, 4-5, and 4-6 (a) show selected frames from the original image sequences of the three action series, respectively.

The proposed system recognizes the action type using a sliding-window scheme.

Figures 4-4, 4-5, and 4-6 (b) show the recognition results. The x-axis of each graph is the frame number, and the y-axis indicates the recognized action. The red line is the ground truth defined by human observation, and the blue line is the recognized action type. The “unknown” periods correspond to times when the human performs actions outside the ten defined categories: in the ground truth, the first unknown period is “get up”, and the second and third are “turn about”. The unknown periods in the recognition results occur because the posture history is not yet long enough (shorter than the window size).

These graphs show that the periods in which the human performs the defined actions are correctly recognized. Some misclassifications can be corrected by smoothing the recognition signal. A small recognition delay occurs at the start of “crawl” because the sliding-window scheme lacks sufficient history at that point; however, the delay is so small that a human observer can hardly notice it. During periods in which the human performs an undefined action, the system chooses the most probable of the ten defined actions. Therefore, more action types must be added to enhance the system.
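The sliding-window behavior described above, including the “unknown” warm-up period and the smoothing step, can be sketched as follows. This is a minimal illustration, not the thesis implementation: the window size is assumed, and the per-window recognizer is stubbed as a majority vote over posture symbols, whereas the actual system scores the window with HMMs.

```python
from collections import Counter, deque

WINDOW_SIZE = 5  # assumed value; not specified in this chapter

def recognize_window(symbols):
    """Stub recognizer: majority vote over the posture symbols.
    In the actual system this would be HMM likelihood scoring."""
    return Counter(symbols).most_common(1)[0][0]

def recognize_stream(frame_symbols, window_size=WINDOW_SIZE):
    """Label every frame; emit 'unknown' until the history fills,
    mirroring the warm-up delay discussed in the text."""
    history = deque(maxlen=window_size)
    labels = []
    for sym in frame_symbols:
        history.append(sym)
        if len(history) < window_size:
            labels.append("unknown")  # not enough posture history yet
        else:
            labels.append(recognize_window(history))
    return labels

def smooth(labels, radius=1):
    """Majority vote in a small neighborhood around each frame,
    correcting isolated misclassifications."""
    out = []
    for i in range(len(labels)):
        lo, hi = max(0, i - radius), min(len(labels), i + radius + 1)
        out.append(Counter(labels[lo:hi]).most_common(1)[0][0])
    return out
```

For example, `recognize_stream(["walk"] * 4 + ["crawl"] * 6, window_size=3)` labels the first two frames "unknown" before the history fills, matching the behavior described above.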

(a) Some frames of the original image sequence ‘sit up – jump 2 – walk – crawl 1’

(b) Recognition result

Figure 4-4 Recognition over a series of actions ‘sit up – jump 2 – walk – crawl 1’

(a) Some frames of the original image sequence ‘sidewalk – walk – pick up’

(b) Recognition result

Figure 4-5 Recognition over a series of actions ‘sidewalk – walk – pick up’

(a) Some frames of the original image sequence ‘crawl 2 – walk – jump 2’

(b) Recognition result

Figure 4-6 Recognition over a series of actions ‘crawl 2 – walk – jump 2’

Chapter 5 Conclusion and Future Work

We have presented an efficient mechanism for human action recognition based on the shape information of postures represented by the star skeleton. We define the extracted skeleton as a five-dimensional vector so that it can be used as a recognition feature. A feature distance (the star distance) is defined so that feature vectors can be mapped into symbols by vector quantization, and action recognition is then performed with HMMs. The system recognizes ten different actions. For single-action recognition, a 98% recognition rate was achieved, and accuracy could be improved further with more intensive training. For recognition over a series of actions, the periods in which the human performs the ten defined actions are correctly recognized.
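The pipeline summarized above (skeleton feature → vector quantization → HMM classification) can be sketched as follows. This is an illustrative sketch only: the codebook values are invented, Euclidean distance stands in for the thesis's star distance, and the HMM parameters are placeholders; only the overall structure follows the text.

```python
import math

def distance(u, v):
    """Stand-in for the star distance between two 5-D skeleton
    features (Euclidean here; the actual definition differs)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def quantize(feature, codebook):
    """Vector quantization: map a feature vector to the index of
    its nearest codeword (the posture symbol)."""
    return min(range(len(codebook)), key=lambda i: distance(feature, codebook[i]))

def log_likelihood(symbols, pi, A, B):
    """Forward algorithm: log P(symbols | model) for a discrete HMM
    with initial probabilities pi, transition matrix A, and emission
    matrix B (rows: states, columns: symbols)."""
    alpha = [pi[s] * B[s][symbols[0]] for s in range(len(pi))]
    for t in range(1, len(symbols)):
        alpha = [
            sum(alpha[s] * A[s][j] for s in range(len(pi))) * B[j][symbols[t]]
            for j in range(len(pi))
        ]
    total = sum(alpha)
    return math.log(total) if total > 0 else float("-inf")

def classify(symbols, models):
    """Choose the action whose HMM gives the symbol sequence the
    highest likelihood. `models` maps action name -> (pi, A, B)."""
    return max(models, key=lambda name: log_likelihood(symbols, *models[name]))
```

In use, each frame's five-dimensional skeleton feature is passed through `quantize`, and the resulting symbol sequence is scored against one trained HMM per action via `classify`.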

Although the proposed technique achieves a high recognition rate, the experimental results also reveal some limitations. First, recognition is strongly affected by the quality of the extracted human silhouette; we used a uniform background to simplify foreground segmentation in our experiments, so a robust system would require a reliable mechanism for extracting the correct foreground object contour. Second, the representative postures in the vector-quantization codebook are picked manually; a clustering algorithm could extract them automatically, making the system more convenient. Third, the viewing direction is essentially fixed, whereas in the real world it varies with camera placement; the proposed method should be extended to handle the changes in human shape and extracted skeleton across different views.

