Chapter 5 Experimental Results
6.2 Future Work
The current system still leaves considerable room for improvement:
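1. Feature extraction
The eyes and eyebrows are located by searching for low gray-level values in the upper half of the face image. Because hair has almost the same gray level as the pupils, too much hair scattered across the subject's forehead causes the pupil positions to be misjudged. This error propagates to every later step that uses the inter-pupil distance as its reference: the candidate regions for the eyes and eyebrows may be wrong, the feature points found are not at the expected positions, and the computed feature values no longer match their original definitions. Detecting the mouth feature points from gray-level intensity alone is likewise not robust enough and does not work for every subject. To make feature extraction more robust, additional information could be incorporated, such as model matching to localize the facial organs, or the geometric proportions among the positions of the organs.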
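As a rough illustration of how geometric proportions could filter out hair-induced false pupils, the sketch below rejects pairs of dark-spot candidates whose spacing, tilt, or vertical position is implausible for a pair of eyes. The function names and every threshold (`dist_range`, `max_tilt_deg`, `y_band`) are hypothetical values chosen for illustration; they are not part of the system described here and would need tuning on real data.

```python
import math


def plausible_pupil_pair(left, right, face_w, face_h,
                         dist_range=(0.30, 0.60), max_tilt_deg=20.0,
                         y_band=(0.25, 0.60)):
    """Return True if two dark-spot candidates could be a pair of pupils.

    left, right: (x, y) pixel coordinates inside the face box, origin top-left.
    All thresholds are illustrative assumptions, not measured values.
    """
    (x1, y1), (x2, y2) = left, right
    dx, dy = x2 - x1, y2 - y1
    if dx <= 0:  # the left candidate must lie strictly left of the right one
        return False
    # Inter-pupil distance as a fraction of the face width.
    dist_ratio = math.hypot(dx, dy) / face_w
    # Angle of the line joining the candidates, relative to horizontal.
    tilt = abs(math.degrees(math.atan2(dy, dx)))
    # Both candidates must fall in the vertical band where eyes are expected.
    in_band = all(y_band[0] <= y / face_h <= y_band[1] for y in (y1, y2))
    return (dist_range[0] <= dist_ratio <= dist_range[1]
            and tilt <= max_tilt_deg and in_band)


def pick_pupils(candidates, face_w, face_h):
    """Return the first plausible pair from a candidate list, or None.

    candidates is assumed to be ordered from darkest to brightest.
    """
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            l, r = sorted((a, b))  # order the pair by x coordinate
            if plausible_pupil_pair(l, r, face_w, face_h):
                return l, r
    return None
```

With a 200 x 240 face box, a dark hair spot at (100, 20) on the forehead is discarded because it lies above the expected eye band and forms a steeply tilted pair with either true pupil, so the pair at (65, 100) and (135, 102) is selected instead.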
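In addition, more feature values could be added, since a larger feature set may raise the recognition rate; wrinkle features are one example. For the speech features, more statistical features could be computed, and the emotion parameters with the most discriminative power could then be selected from them as the features for emotion recognition.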
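One simple way to search a larger pool of statistical features for the discriminative ones is to rank each feature by a Fisher score, the ratio of between-class to within-class variance. The sketch below is a minimal example of that idea under stated assumptions; the function names are hypothetical, and the text does not specify which selection criterion should be used.

```python
from statistics import mean, pvariance


def fisher_scores(samples, labels):
    """Fisher score of each feature: between-class / within-class variance.

    samples: list of equal-length feature vectors (lists of floats).
    labels:  list of class labels, one per sample.
    """
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        overall = mean(s[j] for s in samples)
        between = within = 0.0
        for c in classes:
            vals = [s[j] for s, y in zip(samples, labels) if y == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        # A feature with zero within-class spread separates the classes perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(samples, labels, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

The score treats each feature independently, so it cannot detect features that are only useful in combination; a wrapper method that evaluates feature subsets with the classifier itself would be a heavier alternative.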
2. Emotion recognition
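The recognizer can only decide which emotion category a sample belongs to; it has difficulty judging the intensity of the expressed emotion or detecting blended emotions. Fuzzy logic could be added to the recognition stage to introduce recognizable blended emotions, such as pleasant surprise (surprise combined with happiness), or to estimate the intensity of the expression. Alternatively, the SVM output could be changed to a probabilistic output that gives the probability of the test sample belonging to each of the five emotions; the magnitudes of these probabilities would also make it possible to identify blended emotions and expression intensity. Applied to robotics, this would make robots more intelligent and more human-like.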
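The probabilistic reading of the classifier output can be sketched as follows. This is an illustration only: the softmax over per-class decision scores is a simple stand-in for a properly calibrated SVM probability estimate, the emotion labels other than happiness and surprise are placeholders, and `blend_ratio` is a hypothetical threshold.

```python
import math

# Placeholder label set; only happiness and surprise are named in the text.
EMOTIONS = ["happiness", "surprise", "anger", "sadness", "neutral"]


def softmax(scores):
    """Map per-class decision scores to values that sum to 1.

    A stand-in for calibrated SVM probabilities, used here for illustration.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(probs, labels=EMOTIONS, blend_ratio=0.6):
    """Report one emotion, or a blend of the top two, with a strength value.

    A blend is reported when the second-highest probability is at least
    blend_ratio times the highest one (an assumed rule, not a tuned one).
    """
    ranked = sorted(zip(probs, labels), reverse=True)
    (p1, e1), (p2, e2) = ranked[0], ranked[1]
    if p2 >= blend_ratio * p1:
        return {"emotions": (e1, e2), "strength": p1 + p2}
    return {"emotions": (e1,), "strength": p1}
```

For probabilities of 0.45 for happiness and 0.40 for surprise, the function reports the blend of the two, which corresponds to the pleasant-surprise case; with 0.90 for happiness alone it reports a single emotion. The strength value is only a rough proxy for intensity, since a probability measures classifier confidence rather than how strongly the emotion is expressed.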