

CHAPTER 2 FACE SEGMENTATION

2.2 Face Segmentation Algorithm

The algorithm in [3] is an unsupervised segmentation algorithm, so no manual adjustment of any design parameter is needed to suit a particular input image. The only principal assumption is that a person's face is present in the given image, since the algorithm locates the face rather than detecting whether one exists.

The revised algorithm we use consists of four stages, as depicted in Fig. 2.1.

Fig. 2.1. Outline of face-segmentation algorithm.

A. Color Segmentation

The first stage of the algorithm classifies the pixels of the input image into skin and non-skin regions. To do this, we reference a skin-color reference map in the YCrCb color space. It has been shown that a skin-color region can be identified by the presence of a certain set of chrominance values (i.e., Cr and Cb) that are narrowly and consistently distributed in the YCrCb color space. We use RCr and RCb to denote the respective ranges of Cr and Cb values that correspond to skin color, which in turn define our skin-color reference map. The ranges that [3] found to be the most suitable for all the input images tested there are


RCr = [133, 173] and RCb = [77, 127].

The size of the images we use is 640×480. To reduce computing time, we downsample each image to 320×240 and restore the original resolution in the last stage. In general, an image of M×N pixels is downsampled to M/2×N/2. With the skin-color reference map, the color segmentation output OA is obtained as

OA(x, y) = 1, if [Cr(x, y) ∈ RCr] ∩ [Cb(x, y) ∈ RCb],
OA(x, y) = 0, otherwise,

for x = 0, …, M/2−1 and y = 0, …, N/2−1, where x and y denote the horizontal and vertical coordinates of the picture, respectively. An example illustrating the classification of the original image in Fig. 2.2 is shown in Fig. 2.3.
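As a rough illustration of this classification, a minimal sketch in Python/NumPy is given below; the array names cr and cb (the chrominance planes of the downsampled image) and the function name color_segmentation are ours for illustration and are not part of the implementation in [3].

import numpy as np

# Skin-color ranges quoted above: Cr in [133, 173], Cb in [77, 127].
R_CR = (133, 173)
R_CB = (77, 127)

def color_segmentation(cr, cb):
    # Stage A: classify each pixel of the M/2 x N/2 image as skin (1)
    # or non-skin (0) from its chrominance values alone.
    in_cr = (cr >= R_CR[0]) & (cr <= R_CR[1])
    in_cb = (cb >= R_CB[0]) & (cb <= R_CB[1])
    # OA(x, y) = 1 only when both chrominance values fall in range.
    return (in_cr & in_cb).astype(np.uint8)

For example, if the 320×240 Cr and Cb planes are held in NumPy arrays, color_segmentation(cr, cb) yields the stage-A bitmap.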

However, the result of color segmentation contains not only the pixels of the facial area but may also include other areas whose chrominance values coincide with those of skin color (as is the case in Fig. 2.3). Hence, the successive stages of the algorithm are used to eliminate these misclassified areas.

Fig. 2.2. Original image.

Fig. 2.3. Image after filtering with the skin-color map in stage A.

B. Density Regularization

This stage treats the bitmap produced by the previous stage as a facial region corrupted by noise. The noise may appear as small holes in the facial region caused by undetected facial features such as the eyes, the mouth, or glasses, or as objects with skin-like color in the background scene.

Therefore, this stage first performs simple morphological operations such as dilation, to fill in small holes in the facial area, and erosion, to remove small objects in the background. The intention is not to remove the noise entirely but to reduce its amount and size.

To separate the facial region from non-facial regions more completely, we first need to identify the regions of the bitmap that have a higher probability of being the facial region.

The observation in [3] shows that the facial color is very uniform, so the skin-color pixels belonging to the facial region appear as one large cluster, while the skin-color pixels belonging to the background may appear as many clusters or small isolated objects. Thus, we study the density distribution of the skin-color pixels detected in stage A. A density map D is calculated as follows.

We first partition the output bitmap OA(x, y) of stage A into non-overlapping groups of 4×4 pixels, then count the number of skin-color pixels within each group and assign this value to the corresponding point of the density map.

According to the density value D, we classify each point into one of three types, namely zero (D = 0), intermediate (0 < D < 16), and full (D = 16). A group of points with zero density represents a non-facial region, while a group of full-density points signifies a cluster of skin-color pixels and hence a high probability of belonging to a facial region. Any point of intermediate density indicates the presence of noise. The density map of an example with the three density classes is depicted in Fig. 2.4, where points of zero density are shown in white, intermediate density in green, and full density in black.
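A minimal sketch of the density-map computation and of this three-way classification, again assuming NumPy arrays and function names of our own choosing, is given below.

import numpy as np

def density_map(o_a):
    # Partition OA into non-overlapping 4x4 groups and count the
    # skin-color pixels in each group (density value D in 0..16).
    h, w = o_a.shape
    blocks = o_a[:h - h % 4, :w - w % 4].reshape(h // 4, 4, w // 4, 4)
    return blocks.sum(axis=(1, 3))

def classify_density(d):
    # 0 = zero density, 1 = intermediate, 2 = full.
    return np.where(d == 16, 2, np.where(d == 0, 0, 1))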

Once the density map is derived, we can then begin the process that we termed as density regularization. This involves the following three steps.

1) Discard all points at the edge of the density map, i.e., set D(0, y) = D(M/8−1, y) = D(x, 0) = D(x, N/8−1) = 0 for all x = 0, …, M/8−1 and y = 0, …, N/8−1.

2) Erode any full-density point (i.e., set to zero) if it is surrounded by less than five other full-density points in its local 3×3 neighborhood.

3) Dilate any point of either zero or intermediate density (i.e., set to 16) if there are more than two full-density points in its local 3×3 neighborhood.

After this process, the density map is converted to the output bitmap of stage B, with OB(x, y) = 1 for every full-density point and OB(x, y) = 0 otherwise.

The result of the previous example is displayed in Fig. 2.5.
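The three regularization steps and the conversion to OB can be sketched as follows; the neighborhood counting scheme and the application of the erosion and dilation rules to the same input map (rather than sequentially) are simplifying assumptions of ours.

import numpy as np

def density_regularization(d):
    # Stage B: regularize the M/8 x N/8 density map and threshold it to OB.
    d = d.copy()
    # 1) Discard all points at the edge of the density map.
    d[0, :] = d[-1, :] = 0
    d[:, 0] = d[:, -1] = 0

    full = (d == 16).astype(np.uint8)
    # Count full-density points in each 3x3 neighborhood, excluding the centre.
    padded = np.pad(full, 1)
    neigh = sum(padded[i:i + full.shape[0], j:j + full.shape[1]]
                for i in range(3) for j in range(3)) - full

    out = d.copy()
    # 2) Erode full-density points surrounded by fewer than five full points.
    out[(full == 1) & (neigh < 5)] = 0
    # 3) Dilate zero/intermediate points with more than two full neighbours.
    out[(full == 0) & (neigh > 2)] = 16

    # OB(x, y) = 1 for full-density points, 0 otherwise.
    return (out == 16).astype(np.uint8)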

Fig. 2.4. Density map after classification into three classes.

Fig. 2.5. Image produced by stage B.

C. Geometric Correction

In this stage, we first perform two simple procedures, similar to those introduced in stage B, to ensure that noise appearing in the facial region is filled in and that isolated noise objects in the background are removed. The two procedures are as follows. A pixel in OB(x, y) with a value of one remains a detected pixel only if more than three other pixels in its local 3×3 neighborhood have the same value. At the same time, a pixel in OB(x, y) with a value of zero is reconverted to one (i.e., marked as a potential pixel of the facial region) if it is surrounded by more than five pixels with a value of one in its local 3×3 neighborhood.
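A minimal sketch of these two neighborhood procedures, under the same NumPy assumptions as before, is given below.

import numpy as np

def neighborhood_filter(o_b):
    # Keep a one-pixel only if more than three of its 3x3 neighbours are one;
    # promote a zero-pixel to one if more than five of its neighbours are one.
    padded = np.pad(o_b, 1)
    neigh = sum(padded[i:i + o_b.shape[0], j:j + o_b.shape[1]]
                for i in range(3) for j in range(3)) - o_b
    out = o_b.copy()
    out[(o_b == 1) & (neigh <= 3)] = 0  # drop poorly supported pixels
    out[(o_b == 0) & (neigh > 5)] = 1   # fill small holes
    return out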

We then perform a horizontal scanning process on the “filtered” bitmap, searching for any short continuous run of pixels assigned the value of one. Any group of fewer than four horizontally connected pixels with the value of one is eliminated and set to zero. A similar process is then performed in the vertical direction. As a result, the output bitmap of this stage should contain the facial region with minimal or no noise, as demonstrated in Fig. 2.6.
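The run elimination can be sketched as follows; applying the same routine to the transposed bitmap handles the vertical pass, and the function name is again ours for illustration.

import numpy as np

def remove_short_runs(bitmap, min_run=4):
    # Set to zero any horizontal run of ones shorter than min_run pixels.
    out = bitmap.copy()
    for row in out:
        run_start = None
        for x in range(len(row) + 1):
            inside = x < len(row) and row[x] == 1
            if inside and run_start is None:
                run_start = x
            elif not inside and run_start is not None:
                if x - run_start < min_run:
                    row[run_start:x] = 0
                run_start = None
    return out

# Horizontal pass followed by the vertical pass via transposition:
# o_c = remove_short_runs(remove_short_runs(filtered).T).T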

Fig. 2.6. Image produced by stage C.

D. Contour Extraction

In this final stage, we convert the M/8 × N/8 output bitmap of stage C back to the dimension of M/2 × N/2. To achieve this increase in spatial resolution, we utilize the edge information already made available by the color segmentation in stage A. Each boundary point in the previous bitmap is therefore mapped to the corresponding group of 4×4 pixels, with the value of each pixel as defined in the output bitmap of stage A. A representative output bitmap of this final stage of the algorithm is shown in Fig. 2.7.
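A minimal sketch of this upsampling step is given below; since the text does not define a boundary point precisely, the test used here (a detected point with at least one non-detected 3×3 neighbour) is our assumption.

import numpy as np

def contour_extraction(o_c, o_a):
    # Stage D: expand the M/8 x N/8 bitmap OC back to M/2 x N/2.
    # Interior points become solid 4x4 blocks; boundary points copy the
    # pixel values of the stage-A bitmap OA within the same 4x4 group.
    h, w = o_c.shape
    out = np.kron(o_c, np.ones((4, 4), dtype=np.uint8))  # block upsampling

    padded = np.pad(o_c, 1)
    neigh = sum(padded[i:i + h, j:j + w]
                for i in range(3) for j in range(3)) - o_c
    boundary = (o_c == 1) & (neigh < 8)  # assumed boundary test

    for y, x in zip(*np.nonzero(boundary)):
        out[4 * y:4 * y + 4, 4 * x:4 * x + 4] = o_a[4 * y:4 * y + 4, 4 * x:4 * x + 4]
    return out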

Fig. 2.7. Image produced by stage D.

Chapter 3. Eye Detection, Glasses Existence Detection
