Efficient Face/Pose Detection Based on Machine Learning
0.23 ≤ S ≤ 0.68 and 0 ≤ H ≤ 50. (2)

2.2: THE ADABOOST ALGORITHM

A brief introduction to the AdaBoost algorithm [11][12] is given in Fig. 1. Once the strong classifier H, composed of T features (weak classifiers), has been trained on the provided training samples, it evaluates H(x) for every candidate x. If H(x) = 1, then x is classified as positive, and as negative otherwise. The confidence value of a candidate x, CH(x), is defined in Eq. (4); it is the sum evaluated inside H(x) and indicates how similar the features in H are to those in x:

CH(x) = Σt=1..T αt ht(x). (4)

Fig. 1. The AdaBoost algorithm:
- Given n training images (xi, yi), i = 1, …, n, where yi ∈ {0, 1} for negative and positive training samples respectively.
- Initialize the weights ω1,i = 1/(2m) or 1/(2l) for yi = 0, 1 respectively, where m and l are the numbers of negative and positive training samples.
- For t = 1, …, T:
  - For each feature j, a weak classifier hj is trained using ωt,i; its error is εj = Σi ωt,i |hj(xi) − yi|.
  - Let ht(⋅) = hk(⋅) where εk < εj for all j ≠ k (i.e., choose the weak classifier with the lowest error), and let εt = εk.
  - Update ωt+1,i = ωt,i βt^(1−ei), where βt = εt / (1 − εt) and ei = 0 or 1 according to whether ht(⋅) classifies the training sample xi correctly or incorrectly.
  - Normalize ωt+1 so that it is a weight distribution.
- The final strong classifier is H(x) = 1 if Σt=1..T αt ht(x) ≥ (1/2) Σt=1..T αt, and H(x) = 0 otherwise, where αt = log(1/βt).

The proposed system not only detects whether there is a face but also determines its pose (a left, frontal, or right face). The AdaBoost algorithm is modified to use a 3-dimensional vector yi, e.g., (1, 0, 0) for xi a left face, (0, 1, 0) for a frontal face, and (0, 0, 0) for a non-face.

The training of each weak classifier hj for feature j is explained in the following. Assuming there are n features (weak classifiers), a look-up table with n bins is built such that each feature bin holds 3 classifiers, hj(v1, v2, v3), v1, v2, v3 ∈ {1, 0}, representing left vs. non-left, frontal vs. non-frontal, and right vs. non-right faces, j = 0, …, n−1. To train a classifier, for example the frontal-face classifier for feature j (i.e., to determine the threshold of v2), we evaluate fj(x) for every training sample x, where fj is the function for feature j. Instead of the single hard threshold usual in training weak classifiers for AdaBoost, we adopt the method of [19] to obtain a better judgment. First, the values fj(x) for all x are normalized and evenly divided into k intervals. For the values fj(x) that fall in interval t, t = 0, 1, …, k−1, the averages of fj(x) over the positive samples (x is a frontal face) and over the negative samples (x is not a frontal face) are computed. The midpoint of the two averages is then taken as the threshold for interval t of the frontal-face classifier of feature j. In this way, a look-up table is set up, i.e., hj(v1, v2, v3) is determined for every feature j. To determine whether a sample x is a frontal face according to feature j, the threshold of the interval containing fj(x) is looked up for the frontal-face classifier in bin hj of the table; the output v2 of hj is 1 if fj(x) lies on the same side of the threshold as the positive training samples, and 0 otherwise. Determining whether x is a left or right face according to feature j proceeds in the same way. Although the look-up table is large, with n × 3 × k entries, it is built entirely during the training stage; determining the status of a test sample requires only table references, which is very time efficient.
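The loop of Fig. 1 can be sketched in Python. This is an illustrative sketch, not the paper's implementation: feature values are assumed to be precomputed into a matrix F, each weak classifier is a plain threshold stump on one feature (the paper instead uses the interval look-up tables described above), the stump threshold is crudely taken as the feature mean, and the small epsilon guarding βt is an added safeguard against zero training error.

```python
import numpy as np

def train_adaboost(F, y, T):
    """F: (n, d) matrix of feature values f_j(x_i); y: labels in {0, 1}."""
    n, d = F.shape
    l, m = int(y.sum()), int(n - y.sum())        # positives, negatives
    w = np.where(y == 1, 1.0 / (2 * l), 1.0 / (2 * m))
    classifiers = []                              # (j, theta, polarity, alpha)
    for _ in range(T):
        best = None
        for j in range(d):                        # train one stump per feature
            theta = F[:, j].mean()                # crude threshold (assumption)
            for p in (1, -1):
                h = (p * F[:, j] > p * theta).astype(int)
                err = np.sum(w * np.abs(h - y))   # eps_j = sum_i w_i |h_j(x_i) - y_i|
                if best is None or err < best[0]:
                    best = (err, j, theta, p, h)
        err, j, theta, p, h = best                # weak classifier with lowest error
        beta = max(err, 1e-10) / (1.0 - err)      # beta_t = eps_t / (1 - eps_t)
        alpha = np.log(1.0 / beta)                # alpha_t = log(1 / beta_t)
        e = (h != y).astype(int)                  # e_i = 0 if correct, 1 otherwise
        w = w * beta ** (1 - e)                   # down-weight correctly classified samples
        w = w / w.sum()                           # normalize to a distribution
        classifiers.append((j, theta, p, alpha))
    return classifiers

def strong_classify(classifiers, x):
    """Return (H(x), confidence C_H(x)) for a feature vector x."""
    conf = sum(a for (j, th, p, a) in classifiers if p * x[j] > p * th)
    half = 0.5 * sum(a for (_, _, _, a) in classifiers)
    return (1 if conf >= half else 0), conf
```

`strong_classify` returns both the hard decision H(x) and the confidence value CH(x) = Σ αt ht(x) of Eq. (4), which the detection stage later uses to rank overlapping windows.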
3: THE PROPOSED SYSTEM

The proposed system utilizes two types of machine learning algorithms, one to detect skin and one to detect faces together with their pose. A hierarchical neural network is applied first for skin detection: it begins with a neural network that overcomes the diversity of lighting, followed by a second neural network that distinguishes colors near the skin color. After skin areas are detected, some
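The input to the first neural network, described in Section 3.1 below, is a cross-shaped set of 27 YCbCr values per pixel. A minimal sketch, assuming the neighbor offsets are two pixels in each of the four directions, that `ycbcr` is an H × W × 3 array, and that the coordinates lie in the interior (border handling is left out):

```python
import numpy as np

def cross_feature(ycbcr, r, c):
    """27-value input for the first skin network: Y, Cb, Cr of pixel (r, c)
    and of 8 neighbors, 2 each above, below, left, and right (assumed offsets)."""
    offsets = [(0, 0),
               (-1, 0), (-2, 0), (1, 0), (2, 0),   # top / bottom
               (0, -1), (0, -2), (0, 1), (0, 2)]   # left / right
    return np.concatenate([ycbcr[r + dr, c + dc] for dr, dc in offsets])
```

Each of the 9 pixels contributes its 3 channel values, giving the 27-dimensional input vector.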
morphological operations and a simple connected-component analysis are applied to eliminate possible noise. Finally, every connected component of a skin area is fed into the trained AdaBoost classifier for face and pose detection.

3.1: THE DETECTION OF SKIN AREAS

In the proposed system, only the detected skin areas are processed further for face detection. With hard skin-color thresholds such as those in Eqs. (1), (2), (3), some skin pixels may be sacrificed, which causes difficulties in later steps. A hierarchical neural network is therefore designed to achieve both goals at once: preserve the skin area and eliminate non-skin pixels. The influence of lighting variation on color is one of the main reasons that skin color detection is a challenging task. To overcome this problem, the first neural network is trained separately according to the strength of the luminance Y (Y > 128 or Y ≤ 128). Because skin pixels tend to be connected, the network takes cross-shaped features in the YCbCr color space, as shown in Fig. 2: any pixel, together with 8 other pixels (2 each on its top, bottom, left, and right), each contributing its Y, Cb, Cr values, yields the 27 input values of the neural network.

Fig. 2. The cross-shaped feature taken for the first neural network.

Observing the candidate skin areas output by the first neural network, although all skin pixels are preserved, a few non-skin pixels of similar color are kept as well. The output of the first neural network is therefore processed again by a second neural network, whose goal is to eliminate those pixels that are not skin but have a color similar to skin color. It takes the R, G, B values in the RGB color space as input.

3.2: THE DETECTION OF FACES AND POSES

After the skin areas are located, morphological opening and closing are applied to eliminate noise. A skin area is also discarded if the width-to-height ratio of its connected component is larger than 4 or smaller than 1/4, or if its height or width is less than 20 pixels.

The training samples for AdaBoost are 20 × 20 images of left, frontal, right, and non-faces taken from websites and the CVL face database [20]. These training images are trimmed to cover only facial features as much as possible. Bootstrapping of the training samples is applied to improve the performance of AdaBoost. Test images were downloaded from websites, as seen in Figs. 4, 5, 6 in Section 4. The Haar-like features shown in Fig. 3, together with the variances of the first three Haar-like features, are adopted as the features for AdaBoost.

Fig. 3. The Haar-like features.

To determine whether a skin area contains any faces, a 20 × 20 sliding window is moved over every connected component of a skin area, from left to right and top to bottom. If the center portion (10 × 10) of a sliding window contains fewer than 85 skin pixels, the window is skipped. The remaining windows are fed into the AdaBoost classifier one by one. To detect faces of all sizes, the process is repeated with the image rescaled by a factor of 0.8 each time, until its height or width is less than 40. Since every skin area is examined repeatedly at different scales, a face may well be detected more than once. The confidence value in Eq. (4) is used as the criterion: when overlapping windows give positive responses of the same type (left, frontal, or right face), the window with the largest confidence value is kept and the rest are eliminated.

4: THE EXPERIMENTAL RESULTS

Some experimental results are shown and discussed here. For skin detection, as shown in Fig. 4, the original image (a) is affected by a green cast in its lower portion, and the lighting on the face is uneven as well. Our method, Fig. 4(e), has the best result of all. Notice that the YCbCr method of [6], Eq.
(2), also performs well compared with the results in Fig. 4(b) and (c). In fact, [6] shows satisfying results in general and is frequently referenced when the skin color detection problem is discussed; thus, in Figs. 5 and 6, only [6] is compared with our result. In Fig. 5, testing on different races, both methods, (b) and (c), extract most of the skin areas. Our method preserves more of the skin area, for example the forehead of the lady on the right, at the price that some non-skin pixels are kept as well, such as the left shoulder of the same lady. The same behavior appears in Fig. 6: because the background and the lady's hair are similar in color to skin, our method, Fig. 6(b), preserves not only the correct skin area but the hair and background too. The method of [6], Fig. 6(c), also wrongly identifies the blond hair as skin; it successfully eliminates the background, but at the price of eliminating facial skin area as well. As a preprocessing step for face detection, Fig. 6(c) keeps no face area at all, which consequently results in no face being detected. Therefore, our method is more suitable for the subsequent face detection.
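Returning to the detection procedure of Section 3.2, the window scan over a skin component and the scale pyramid can be sketched as follows. The `classify` callback, the function names, and the border conventions are assumptions for illustration; the real evaluator is the trained AdaBoost classifier.

```python
import numpy as np

def scan_skin_component(skin_mask, classify, win=20, min_skin=85):
    """Slide a win x win window over a binary skin mask; a window is evaluated
    only if its 10 x 10 center holds at least min_skin skin pixels.
    classify(r, c) stands in for the AdaBoost evaluation (assumption);
    the 5-pixel center margin assumes win == 20."""
    H, W = skin_mask.shape
    detections = []
    for r in range(H - win + 1):
        for c in range(W - win + 1):
            center = skin_mask[r + 5:r + 15, c + 5:c + 15]
            if center.sum() < min_skin:
                continue                      # too few skin pixels: skip window
            label, conf = classify(r, c)      # e.g. 'left'/'frontal'/'right'/None
            if label is not None:
                detections.append((r, c, label, conf))
    return detections

def scales(h, w, factor=0.8, min_side=40):
    """Pyramid scale factors: shrink by 0.8 until height or width drops below 40."""
    s, out = 1.0, []
    while min(h * s, w * s) >= min_side:
        out.append(s)
        s *= factor
    return out
```

Overlapping positive windows of the same type would then be reduced, as described in Section 3.2, to the single window with the largest confidence value.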
(a) (b) (c) (d) (e)
Fig. 4. Results of skin detection with (a) the original image, and by the methods of (b) HSV [18], (c) RGB [3], (d) YCbCr [6], (e) ours.

(a) (b) (c)
Fig. 5. Results of skin detection on different races with (a) the original image, and by the methods of (b) ours, (c) YCbCr [6].

(a) (b) (c)
Fig. 6. Results of skin detection with (a) the original image, and by the methods of (b) ours, (c) YCbCr [6].

For face/pose detection, Fig. 7 shows some of our experimental results. The red, blue, and green boxes mark detected left, frontal, and right faces respectively. These images, except (c), are natural images with all kinds of background settings. Our method in general shows satisfying results. In (c) and (d) there are multiple boxes with their confidence values indicated; the one with the largest value is taken as the representative box, which is also the correct face area.

(a) (b) (c) (d)
Fig. 7. Results of face detection, where red, blue, and green boxes mark left, frontal, and right faces respectively.

5: CONCLUSION

In this paper, we use a hierarchical neural network for skin detection: a first neural network overcomes the diversity of lighting, and a second neural network handles colors near the skin color. After the skin area is detected, an AdaBoost learning algorithm is applied for face/pose detection. Experimental results show that the proposed method achieves good performance in both skin color detection and face/pose detection, and is able to cope with scaling, rotation, and multiple faces.

The difficulty of skin detection and face/pose detection lies in unconstrained backgrounds and the diversity of the target. Using machine learning to find subtle distinctions between positive and negative samples is a promising approach, and its success largely depends on the training samples; how to choose enough good training samples is therefore an interesting problem. In the future, we will focus on finding better training samples and on the possibility of integrating the system with other learning methods, such as SVM and PCA.

REFERENCES

[1] S. L. Phung, A. Bouzerdoum, and D. Chai, "Skin Segmentation Using Color Pixel Classification: Analysis and Comparison," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 1, January 2005.
[2] J. Yang and A. Waibel, "Tracking Human Faces in Real Time," CMU-CS-95-210, 1995.
[3] F. Solina, P. Peer, B. Batagelj, S. Juvan, and J. Kovač, "Color-Based Face Detection in the 15 Seconds of Fame Art Installation," Proc. Mirage 2003, INRIA Rocquencourt, France, March 2003.
[4] K. P. Seng, A. Suwandy, and L. M. Ang, "Improved Automatic Face Detection Technique in Color Images," IEEE, 2004.
[5] Y. Wang and B. Yuan, "A Novel Approach for Human Face Detection from Color Images under Complex Background," Pattern Recognition, vol. 34, 2001, pp. 1983–1992.
[6] L. Jia and L. Kitchen, "Face Detection Using Quantized Skin Color Regions Merging and Wavelet Packet Analysis," IEEE Transactions on Image Processing, vol. 9, no. 1, Jan. 2000, pp. 80–87.
[7] L.-H. Zhao, X.-L. Sun, J.-H. Liu, and X.-H. Xu, "Face Detection Based on Skin Color," Proc. Third International Conference on Machine Learning and Cybernetics, Shanghai, August 2004.
[8] H. Wang and S.-F. Chang, "A Highly Efficient System for Automatic Face Region Detection in MPEG Video," IEEE
Trans. Circuits Syst. Video Technol., vol. 7, no. 4, pp. 615–628, 1997.
[9] M. Jiang, G.-M. He, and Z.-H. Gan, "Extending Active Shape Models with Color Information for Facial Features Localization," IEEE Int. Workshop on VLSI Design & Video Technology, Suzhou, China, May 2005.
[10] Y. Nara, J. Yang, and Y. Suematsu, "Face Detection Using the Shape of Face with Both Color and Edge," Proc. 2004 IEEE Conference on Cybernetics and Intelligent Systems, Singapore, December 2004.
[11] P. Viola and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features," Proc. IEEE CS Conf. on Computer Vision and Pattern Recognition, Dec. 2001.
[12] P. Viola and M. Jones, "Robust Real-Time Object Detection," IEEE ICCV Workshop on Statistical and Computational Theories of Vision, July 2001.
[13] T. Lee, S.-K. Park, and M. Park, "A New Facial Features and Face Detection Method for Human-Robot Interaction," Proc. 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain, April 2005.
[14] E. S. M. Saad, M. M. Hadhoud, M. I. Moawad, M. El Halawany, and A. M. Abbas, "Detection of Faces in a Color Natural Scene Using Skin Color Classification and Template Matching," 22nd National Radio Science Conference, March 15–17, 2005, Cairo, Egypt.
[15] B. Mohabbati and S. Kasaei, "An Efficient Wavelet/Neural Networks-Based Face Detection Algorithm," IEEE, 2005.
[16] P. Wang and Q. Ji, "Learning Discriminant Features for Multi-View Face and Eye Detection," Proc. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005).
[17] P. Shih and C. Liu, "Face Detection Using Distribution-Based Distance and Support Vector Machine," Proc. Sixth International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2005).
[18] S. L. Phung, A. Bouzerdoum, and D. Chai, "Skin Segmentation Using Color Pixel Classification: Analysis and Comparison," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 1, January 2005.
[19] C. Huang, H. Ai, Y. Li, and S. Lao, "Vector Boosting for Rotation Invariant Multi-View Face Detection," Proc. Tenth IEEE International Conference on Computer Vision (ICCV '05).
[20] The CVL face database, http://lrv.fri.uni-lj.si/index.html.