
According to the analysis above, we now compare the descriptors mentioned before. The most commonly used descriptors are the curvature-based ones, such as the shape index [15, 16, 17, 18, 19, 20], HK classification [9, 10, 11, 12, 21, 23, 25] and the principal curvatures [7]. In [14], Ceron compared the principal curvatures, mean curvature, Gaussian curvature, shape index and curvedness to determine which descriptor is the most representative. He designed two kinds of tests: the first tried to determine which descriptor is the most representative over all points, and the second was composed of several tests depending on the facial region. The experimental results showed that the shape index is the best shape descriptor for points on the human face in every test.
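To make these curvature-based descriptors concrete, the following minimal sketch (in Python with NumPy; our own illustration, not part of [14]) computes the shape index and curvedness from principal curvatures that are assumed to have been estimated already, for example by local quadric fitting:

```python
import numpy as np

def shape_index_and_curvedness(k1, k2):
    """Shape index and curvedness from principal curvatures, with k1 >= k2.

    Uses the Koenderink [-1, 1] convention; some papers rescale the shape
    index to [0, 1]. Flat points (k1 = k2 = 0) are mapped to 0 here, although
    the shape index is formally undefined there.
    """
    k1 = np.asarray(k1, dtype=float)
    k2 = np.asarray(k2, dtype=float)
    # Shape index: +1 for a dome (cap), -1 for a cup, 0 for a saddle.
    si = (2.0 / np.pi) * np.arctan2(k1 + k2, k1 - k2)
    # Curvedness: how strongly the surface bends, independent of its shape.
    cv = np.sqrt((k1 ** 2 + k2 ** 2) / 2.0)
    return si, cv
```

Applied to every vertex of a facial scan, these two values already separate cap-like regions such as the nose tip from cup-like regions such as the inner eye corners.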

Apart from the curvature-based descriptors, Effective Energy and Distance to Local Plane both use the relationship between a feature point and the points in its neighborhood to identify feature points. Because of their weak representativeness, they can only roughly identify peak-like, valley-like or planar shapes. Unlike EE and DLP, the spin image is a strong local descriptor. It transforms the 3D coordinates into a 2D cylindrical coordinate system, and this cylindrical representation makes the descriptor robust to rotation. This property gives the spin image stronger descriptiveness than the other descriptors; however, its computational cost is high, and reducing this cost is an urgent problem for the spin image. If we want to obtain the structural information of the human face, the facial profile is a good descriptor. The facial profile is important for describing the nose and is widely applied to nose region extraction, but the structural information it encodes makes it very sensitive to rotation. In the following chapter, we will introduce how different algorithms, or combinations of different descriptors, have been used to fix the descriptors' shortcomings and make the extraction systems more robust.
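As an illustration of the neighborhood-based descriptors, the sketch below computes one common reading of Distance to Local Plane, namely the signed distance from a point to the least-squares plane of its k nearest neighbours; the neighbourhood size and the sign convention are our assumptions, not definitions taken from the papers above:

```python
import numpy as np

def distance_to_local_plane(points, index, k=30):
    """Signed distance from points[index] to the least-squares plane of its
    k nearest neighbours (a generic reading of the DLP idea)."""
    p = points[index]
    # Brute-force k-nearest-neighbour search (fine for a sketch; a k-d tree
    # would be used on a full facial scan).
    d = np.linalg.norm(points - p, axis=1)
    neigh = points[np.argsort(d)[1:k + 1]]          # skip the point itself
    centroid = neigh.mean(axis=0)
    # Plane normal = right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(neigh - centroid)
    normal = vt[-1]
    # Signed distance of p to the fitted plane.
    return float(np.dot(p - centroid, normal))
```

Large positive or negative values indicate peak-like or valley-like neighbourhoods, while values near zero indicate a locally planar patch, which matches the rough three-way labelling mentioned above.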

Chapter 3

Expression Invariant Facial Feature Extraction

In Chapter 2, we introduced some shape descriptors for 3D surfaces. Before trying to identify the feature points in the complex data, the local shape attributes corresponding to each point on a facial surface should be computed beforehand. Using prior knowledge of the human face, we can then establish discriminative criteria for each facial feature point. The most prominent feature point is clearly the nose. The nose sticks out from the whole face and has a roof-like shape; moreover, it falls approximately in the center of the face. These properties make the nose tip quite easy to extract. Apart from the nose tip, the eye corners are also easy to identify on the human face. Locally, the eye corners have a valley-like shape, so they can be extracted by curvature analysis, by other local shape descriptors, or by analyzing the colors around them. Because of their invariance to facial expression, the nose tip and the eye corners (both the inner and the outer corners) are the most frequently extracted facial feature points. Extracting other facial feature points, such as the chin, the nose bridge, the mouth corners, and the lips, is not as easy as extracting the salient nose tip and eye corners, although they are still facial features we are interested in; these features are easily affected by varying expression and changing head pose. Hence extracting facial feature points other than the nose tip and the eye corners requires methods that are more robust to variations in facial expression and head pose.

People have proposed many kinds of methods to extract the facial feature points. The most direct methods are based on a single shape descriptor such as Gaussian and mean curvature, the shape index, the spin image, or the facial profile. If the shape descriptor represents the facial feature points only weakly, other descriptors are combined to enhance the representation, or a heuristic framework or cascaded filtering is used to extract the facial feature points more precisely. Beyond that, some methods combining 2D and 3D information have been proposed, and experiments have verified that such combined methods perform better than methods using either modality alone. It is hard to categorize these many methods in a systematic way, so we re-think them from the opposite direction and return to the facial scan itself. When acquiring facial scans, we find that they are not always the same, because of different head poses and different facial expressions. We can therefore roughly categorize facial scans into three groups: frontal facial scans, facial scans with head pose variation, and facial scans with facial expression variation. It is without doubt easier to localize the facial feature points on a frontal facial scan than on the others: the missing facial data caused by the head pose and the variation in the feature points' characteristics caused by facial expression both affect the localization result. Hence we will illustrate the extraction methods in detail along these three directions and give a comparison of the methods at the end of this chapter.

3.1 Extraction of Facial Feature Points on a Static Frontal Facial Scan

In this section, we start to introduce how to extract the facial feature points. If we require the test facial scan to be a frontal facial scan, we obtain sufficient information about the scan. Many methods have been proposed to extract the facial feature points on a frontal facial scan. Most of the existing methods apply prior knowledge of the human face and make use of facial geometry-based analysis to localize geometrically salient feature points such as the nose tip, eye corners, chin tip, mouth corners, etc. We can use the shape descriptors to describe the geometrical characteristics of each facial feature point and apply the corresponding methods to extract them.

Many existing works use curvature-based descriptors to label the local shape. HK classification is a common descriptor in curvature-based analysis. In [12], Cheng found the inner eye corner regions first, and then located the nose region and the nose bridge. He used HK classification to label the surface type. However, he found that if only HK classification is used, there are too many pit regions to find the corresponding inner eye corners. Therefore he first removed the small pit regions containing fewer points than a threshold value. Then a pair of regions with similar average values in both Y and Z can be regarded as the inner eye corners. Next, he searched the region between the two inner eye corners for a peak region, which is exactly the nose region. This heuristic algorithm gives good results for extracting the three feature points. Although HK classification is rotation invariant, Cheng designed the algorithm for localizing the nose tip and inner eye corners around a frontal position, which makes it sensitive to head pose and only useful on a frontal facial scan. In [9], Colombo also used HK classification to locate three facial feature points: the nose tip and the inner eye corners. To improve the accuracy, Colombo used several filters to reduce the search region and then regarded the regions with the highest curvature values as the nose tip and the inner eye corners. He indicated that other regions, like the mouth, cheeks, or forehead, do not present particular or simple curvature characteristics that would allow robust automatic detection.
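The HK labelling used by Cheng and Colombo can be sketched as follows, assuming the mean curvature H and the Gaussian curvature K have already been estimated per point; the zero threshold and the sign convention (which depends on how the surface normals are oriented) are assumptions to be tuned per dataset, and the region-level filtering steps of their pipelines are not shown:

```python
import numpy as np

def hk_classify(H, K, eps=1e-3):
    """Besl-Jain style HK surface labelling from mean (H) and Gaussian (K)
    curvature arrays of equal shape."""
    H = np.asarray(H, dtype=float)
    K = np.asarray(K, dtype=float)
    h_neg, h_zero, h_pos = H < -eps, np.abs(H) <= eps, H > eps
    k_neg, k_zero, k_pos = K < -eps, np.abs(K) <= eps, K > eps

    labels = np.empty(H.shape, dtype=object)
    labels[k_pos & h_neg] = "peak"
    labels[k_pos & h_pos] = "pit"
    labels[k_zero & h_neg] = "ridge"
    labels[k_zero & h_pos] = "valley"
    labels[k_zero & h_zero] = "flat"
    labels[k_neg & h_neg] = "saddle ridge"
    labels[k_neg & h_pos] = "saddle valley"
    labels[k_neg & h_zero] = "minimal"
    # K > 0 with H ~ 0 cannot occur on a noise-free surface; label it anyway.
    labels[k_pos & h_zero] = "none"
    return labels
```

Cheng's filtering step then simply discards connected components of "pit"-labelled points that contain fewer points than the size threshold, before pairing the surviving pit regions into inner eye corners.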

In addition to HK classification, the shape index is often used to label the surface type in curvature-based analysis. Lu and Jain [18] proposed a method to locate the positions of the eye and mouth corners and the nose tip, based on a fusion of the shape index on a range image and the cornerness response on an intensity image. They also developed a heuristic method based on cross-profile analysis to locate the nose tip more robustly on a frontal scan. By searching for the point with the maximum z value in each row, the column holding the maximum number of such row-wise maxima is regarded as the mid-line of the facial scan, as shown in Fig. 16. The vertical z profile along the mid-line is shown in Fig. 17. Lu and Jain assumed that the nose tip is close to the mid-line and that the nose bridge presents a strong consecutive increase in z value.

Figure 16: Finding the face mid-line. (a) The yellow marks represent the positions where the z value reaches the extremum along each row. (b) Total number of extreme z values (yellow points) in each column. (c) The mid-line (in blue) is located by choosing the column with the maximum peak in (b) [18].

Figure 17: The depth (Z) profile along the mid-line [14].
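Lu and Jain's mid-line step can be sketched as below, under the assumption that the scan is given as a range image with NaN on background pixels (the data layout is ours, not taken from [18]):

```python
import numpy as np

def find_midline_column(Z):
    """Mid-line estimate in the spirit of Lu and Jain's cross-profile
    analysis: mark the column holding the maximum depth (z) value in each
    row, then pick the column collecting the most such marks.
    Ties within a row are ignored in this sketch."""
    Z = np.asarray(Z, dtype=float)
    votes = np.zeros(Z.shape[1], dtype=int)
    for row in Z:
        if np.all(np.isnan(row)):
            continue                      # empty row: no face pixels
        votes[np.nanargmax(row)] += 1     # column of the row's depth maximum
    return int(np.argmax(votes))          # column index of the mid-line

# The vertical z profile along the mid-line (cf. Fig. 17) is then simply
# Z[:, find_midline_column(Z)].
```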

Then they use these two properties to find the nose tip. After finding the nose tip, they fit a statistical model of the facial features around it, which serves as a prior constraint to reduce the search area for the remaining feature points. The statistical model not only greatly reduces the computational cost but also enhances the accuracy of the extraction results. Finally, they use the min-max rule to normalize the shape index S and the cornerness C respectively. If the normalized shape index is denoted by S'(p), it can be computed at point p as

S'(p) = (S(p) - min{S}) / (max{S} - min{S})    (3.1)

where {S} is the set of shape index values in the search region for the feature point. The cornerness C is normalized in the same way, giving C'(p). At last, the final score Z(p) is computed by integrating the scores from the two modalities using the sum rule

Z(p) = (1 - S'(p)) + C'(p)    (3.2)

The point with the highest Z(p) in each search region is identified as the corresponding feature point. With this process, extraction of the eye corners and mouth corners can be accomplished. Lu and Jain tested their method on 98 frontal scans with neutral expression and 98 frontal scans with a smiling expression. The experimental results showed that the extraction of the nose tip by the heuristic method is good on a frontal scan, and Lu also indicated that combining 2D and 3D information improves the accuracy relative to using 2D or 3D alone. However, we find that the experimental data all have neutral or smiling expressions without strong expressions, so we cannot know whether their method is robust to expression. Moreover, Lu and Jain integrated the shape index from a range image with the cornerness from an intensity image; since cornerness is a 2D property, it may be influenced by illumination or head pose.
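Eqs. (3.1) and (3.2) amount to a min-max normalization followed by a sum-rule fusion. The sketch below shows the computation inside one search region; the variable names are ours, not Lu and Jain's:

```python
import numpy as np

def fuse_shape_index_and_cornerness(shape_index, cornerness):
    """Min-max normalization and sum-rule fusion as in Eqs. (3.1)-(3.2),
    for the candidate points of a single search region."""
    s = np.asarray(shape_index, dtype=float)
    c = np.asarray(cornerness, dtype=float)

    def minmax(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    s_norm, c_norm = minmax(s), minmax(c)
    # Eq. (3.2): a low shape index (pit-like corner) and a high cornerness
    # both raise the score.
    return (1.0 - s_norm) + c_norm

# Example: index of the best candidate inside one search region
# best = np.argmax(fuse_shape_index_and_cornerness(si_region, corner_region))
```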

Segundo [30] also proposed a face and facial feature detection method that makes use of the facial profile and combines 2D face segmentation on depth images with surface curvature information. He aimed to localize the eye corners, nose tip, nose base, and nose corners. Fig. 18 illustrates the algorithm in detail. Segundo first used HK classification to isolate the regions he was interested in. Then he computed two y-projections of the depth information, named the profile curve and the median curve, to find the nose tip's y-coordinate. These curves are obtained by taking, for every set of points with the same y-coordinate in the face image, the maximum depth value (profile curve) and the median depth value (median curve). The curves are shown in Fig. 19, and the y-coordinate with the maximum difference between them is the nose tip's y-coordinate.

Figure 18: Diagram of Segundo's landmark detection approach.

Figure 19: Example of Segundo's nose tip y-coordinate detection [30].
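The profile-curve/median-curve step for the nose tip's y-coordinate can be sketched as follows, assuming a segmented face depth image with NaN on background pixels (the data layout is our assumption):

```python
import numpy as np

def nose_tip_y(depth):
    """Nose tip row in the spirit of Segundo's projection step: for every
    row (same y), take the maximum depth (profile curve) and the median
    depth (median curve); return the row with the largest gap between
    the two curves."""
    depth = np.asarray(depth, dtype=float)
    rows = depth.shape[0]
    profile = np.full(rows, np.nan)   # maximum depth per row (profile curve)
    median = np.full(rows, np.nan)    # median depth per row (median curve)
    valid = ~np.all(np.isnan(depth), axis=1)
    profile[valid] = np.nanmax(depth[valid], axis=1)
    median[valid] = np.nanmedian(depth[valid], axis=1)
    # The nose tip row is where the profile and median curves differ most.
    return int(np.nanargmax(profile - median))
```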

For the other feature points' y-coordinates, Segundo followed the same procedure but computed the projection on a curvature image. The resulting y-projection profile presents three peaks: the eyes, the nose base, and the mouth. Since the nose tip's y-coordinate is already known, Segundo used the relative positions of the nose tip and the other feature points to find their y-coordinates: the peak closest to the nose tip is taken as the nose base y-coordinate, the peak above it is defined as the eye corners' y-coordinate, and the peak below it represents the mouth y-coordinate, as shown in Fig. 19. The next step is to find the x-coordinates of all the feature points. A similar process is applied, but now around the x-projection profile. Segundo first finds the nose tip's x-coordinate and uses this information to find the nose base. To find the x-coordinates of the eye corners, Segundo computed the x-projection of the curvature image by calculating the percentage of pit curvature points in every column within a set of neighboring rows centered at the eye y-coordinate; a sketch of this step is given below. Segundo tested the whole system on a total of 2500 images of 100 subjects with large facial expression variations. The experimental results show high accuracy on frontal scans and strong robustness to facial expression. In [30], Segundo also compared the experimental results with Lu and Jain's [18] method and showed better performance. However, the system does not demonstrate robustness to head pose.

Figure 19: Example of facial landmark detection on the y-coordinate: eyes, nose base, and mouth [30].
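The pit-percentage x-projection for the eye corners, mentioned above, can be sketched as follows; it reuses the label image produced by the earlier hk_classify sketch, and the band half-width is an assumption of ours:

```python
import numpy as np

def pit_percentage_per_column(labels, eye_row, half_window=10):
    """Fraction of 'pit'-labelled pixels in every column, restricted to a
    band of rows centered on the eye y-coordinate."""
    labels = np.asarray(labels, dtype=object)
    top = max(eye_row - half_window, 0)
    bottom = min(eye_row + half_window + 1, labels.shape[0])
    band = labels[top:bottom, :]
    return np.mean(band == "pit", axis=0)   # pit fraction per column

# Peaks of this profile indicate candidate eye-corner x-coordinates.
```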

Setting aside the curvature-based descriptors, Xu used effective energy to describe the local distribution of neighboring points and built a hierarchical filtering scheme based on two rules: rule 1 states that a point is a nose tip candidate only if all the components in its effective energy set are negative, and rule 2 states that a point is a nose tip candidate only if the mean of its neighbors' EE is smaller (more negative) and the variance is larger than those of any other facial area [28]. Fig. 20 illustrates the process of detecting the nose tip. Although effective energy is a weak representation for a single point, Xu uses the hierarchical filtering scheme to make up for the insufficient descriptiveness. He tested the proposed algorithm on three different databases, which include facial scans with pose variation and facial scans with expression variation, and reached a correct detection rate of up to 99.3%. Although the system is robust to pose and expression variations, this is only shown for the nose tip; we cannot be sure that the system would still perform well on detecting other feature points.

Figure 20: The process for detecting the nose tip [28].
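Rule 1 of Xu's filtering can be sketched as below, under the assumption that the effective energy of a neighbour q with respect to a point p is the projection of q - p onto the normal at p; the exact definition in [28] may differ in detail, and rule 2 would be applied to the survivors afterwards:

```python
import numpy as np

def effective_energy_set(points, normals, index, k=30):
    """Effective-energy components of one point: projections of the vectors
    to its k nearest neighbours onto the point's normal (a common reading
    of EE, assumed here)."""
    p, n = points[index], normals[index]
    # Brute-force neighbour search; a k-d tree would be used in practice.
    d = np.linalg.norm(points - p, axis=1)
    neigh = points[np.argsort(d)[1:k + 1]]
    return (neigh - p) @ n

def rule1_nose_candidates(points, normals, k=30):
    """Keep a point only if every component of its effective-energy set is
    negative, i.e. the neighbourhood lies below the tangent plane
    (convex, peak-like), as stated by rule 1."""
    keep = []
    for i in range(len(points)):
        ee = effective_energy_set(points, normals, i, k)
        if np.all(ee < 0):
            keep.append(i)
    return keep
```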

Conde [19] presented a 3D facial feature location method based on the spin image registration technique. First, he wanted to isolate the candidate areas that contain the feature points. Conde assumed that the areas of interest have higher curvature, so he calculated the mean curvature at each point and was thereby able to isolate the candidate areas containing the facial feature points. Once the candidate areas have been found, the spin image is computed for each point in the candidate areas. Since each point creates a different spin image, Conde applied a support vector machine (SVM) classifier to compare the spin images of the points. The method was tested on a database of 51 subjects, which includes some scans with small pose variations; these small pose variations do not affect the experimental results, and the detection rate reaches 99.66% on frontal scans. In [29], Conde indicated that although the spin image can be a powerful tool to represent the facial features, it requires a great computational effort. Reducing this computational cost is an important problem that must be solved to optimize the system.
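The spin image itself can be computed in a few lines, as sketched below; the bin size and image width are assumptions that control the descriptor's support and resolution, and the SVM comparison step is not shown:

```python
import numpy as np

def spin_image(points, p, n, bin_size=2.0, image_width=16):
    """Spin image of an oriented point (p, n): every surface point x maps to
    alpha (radial distance from the normal line through p) and beta (signed
    height along the normal), and the (alpha, beta) pairs are accumulated
    into a 2D histogram."""
    d = points - p
    beta = d @ n                                   # height along the normal
    alpha = np.sqrt(np.maximum(np.sum(d * d, axis=1) - beta ** 2, 0.0))

    img = np.zeros((image_width, image_width))
    # Map (alpha, beta) to histogram bins; beta = 0 falls in the middle row.
    i = np.floor(image_width / 2 - beta / bin_size).astype(int)
    j = np.floor(alpha / bin_size).astype(int)
    inside = (i >= 0) & (i < image_width) & (j >= 0) & (j < image_width)
    np.add.at(img, (i[inside], j[inside]), 1)      # accumulate point counts
    return img
```

Because alpha and beta are defined relative to the point's own normal, the histogram is unchanged by rigid rotations of the scan, which is the rotation invariance discussed at the end of Chapter 2.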

So far, we have discussed some common methods corresponding to the shape descriptors introduced earlier, including HK classification, shape index, and profile curves. Although the proposed methods all localize the feature points well on frontal scans, this does not guarantee that the extraction results remain as good when the facial scan exhibits some variation as they are on a neutral frontal scan. We have not only introduced the concept of each method but also discussed its possible problems. Most of the problems we are concerned about are the effects of head pose variation and of facial expression; these will be discussed in the following two sections.
