Face localization based on wavelet extrema


Face Localization in Cluttered Background

Jing-Wein Wang

Institute of Photonics and Communications

National Kaohsiung University of Applied Sciences

E-mail: jwwang@cc.kuas.edu.tw

Abstract

We formulate the localization of upright, frontal-view human faces in complex scenes as a wavelet-based problem and develop a novel approach that uses the wavelet extrema density to determine the position of a single face. A face-of-interest (FOI) region is first located and framed by finding facial edges using the inter-orientation wavelet subbands and an anthropometric measure. A refining step using the proposed head contour detection (HCD) approach is then carried out to further improve the localization accuracy. Comparisons with existing state-of-the-art face detection works show that our system has comparable performance in terms of face localization rate and quality.

Keywords: Face localization, Extrema density, Face-of-interest (FOI), Head contour detection (HCD).

1. Introduction

Face localization, a simplified detection problem that aims to determine the image position and size of a single face, is a fundamental stage in the process of face recognition. Because most recognition techniques (e.g., eigenfaces [1]) assume that the face image is normalized in terms of scale and rotation, their performance depends heavily on the accuracy of the detected face position within the image. This makes face detection a crucial step in the process of face recognition. Recently, a sizable body of research in the area of face detection has been amassed; an excellent survey of the relevant literature can be found in [2]. A major technical challenge that remains is the unsatisfactory performance of face detectors in rather unconstrained environments.

Recent works have proposed the use of wavelet functions as activation functions and have shown their power in face detection problems [3]. Although wavelet decompositions can map the useful information content into a lower-dimensional feature space, two questions still deserve further study: which features give an efficient representation for the selected basis, and how to develop a computationally efficient face localization algorithm. In this paper, based on the dyadic discrete wavelet transform (DWT) [4], we propose an efficient localization method that extracts the 2-D wavelet extrema density as a feature from the three decomposed octave-width subbands. A gradient-based boundary search algorithm is then used to find a coarse boundary of the FOI. Finally, a refining step readjusts the previously located bounding box by using head contour detection. The remainder of this paper is organized as follows. Section 2 presents the characterization and extraction of the wavelet extrema density for facial edge detection. Section 3 describes the proposed face localization and head contour refining method. Experimental results and comparisons of the proposed scheme with existing works are provided in Section 4. Conclusions are drawn in Section 5.
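As a rough illustration of the extrema density feature described above, the following Python sketch applies one level of a 2-D Haar transform and counts local extrema of the coefficient magnitude in a detail subband. It is not the implementation used in this work: the Haar filter, the decimated (rather than dyadic) transform, the LH/HL naming convention, the 1-D extremum test, the threshold value, and the function names `haar_dwt2`, `count_extrema`, and `extrema_density` are all assumptions made for the example.

```python
def haar_dwt2(image):
    """One level of a 2-D orthonormal Haar transform (assumed stand-in filter).

    `image` is a list of equal-length rows with even height and width.
    Returns (LL, LH, HL, HH), each half the size of the input.
    Assumed convention: LH is low-pass horizontally and high-pass
    vertically, so it responds to horizontal edges.
    """
    rows, cols = len(image), len(image[0])
    LL, LH, HL, HH = [], [], [], []
    for r in range(0, rows, 2):
        ll, lh, hl, hh = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top pair of the 2x2 block
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            ll.append((a + b + d + e) / 2.0)
            lh.append((a + b - d - e) / 2.0)  # vertical difference
            hl.append((a - b + d - e) / 2.0)  # horizontal difference
            hh.append((a - b - d + e) / 2.0)  # diagonal difference
        LL.append(ll); LH.append(lh); HL.append(hl); HH.append(hh)
    return LL, LH, HL, HH


def count_extrema(seq, threshold):
    """Count interior local maxima of |seq| that exceed `threshold`."""
    mags = [abs(v) for v in seq]
    count = 0
    for i in range(1, len(mags) - 1):
        if mags[i] > threshold and mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]:
            count += 1
    return count


def extrema_density(band, threshold, along_rows=True):
    """Extrema count per coefficient, scanning each row or each column."""
    lines = band if along_rows else [list(col) for col in zip(*band)]
    total = sum(count_extrema(line, threshold) for line in lines)
    return total / (len(band) * len(band[0]))


# Toy 8x8 image with a single horizontal step edge between rows 2 and 3.
img = [[0] * 8 for _ in range(3)] + [[10] * 8 for _ in range(5)]
LL, LH, HL, HH = haar_dwt2(img)
print(extrema_density(LH, threshold=1.0, along_rows=False))  # 0.25
print(extrema_density(HL, threshold=1.0, along_rows=True))   # 0.0
```

In this toy case only the LH subband carries extrema, since the image contains a horizontal edge and no vertical one. The columns of LH are scanned because a horizontal edge appears as a peak along the vertical direction.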

2. Facial Feature Extraction

Texture, which offers a means of detecting objects in background clutter that has similar spectral characteristics, serves as the visual cue that distinguishes a human face from the background [5]. One obvious feature for describing facial texture is roughness. Since a face may exhibit different roughness across the decomposed wavelet subbands, it is reasonable to detect the face region using features derived from wavelet transform extrema. Roughness corresponds to what our sense of touch perceives from an object, and in two-dimensional scans it can be characterized by depth (wavelet coefficient strength) and width (separation between wavelet extrema). This interpretation prompted us to treat the selected extrema as a signature of roughness that is useful as a distinctive feature for face texture measures. The properties of these extrema were studied in [6], and

International Conference on Intelligent Information Hiding and Multimedia Signal Processing


abovementioned images can be detected by our algorithm. For both our method and the Crete face detector, over-framed FOIs cannot be totally avoided. In terms of speed, our system is faster, with an average processing time of 1.6 to 2.1 s per BioID image on a 1.0 GHz Pentium III PC, whereas the Crete face detector took an average of 2.0 to 4.7 s per image on the same test data but on a different platform.

For the Visionics data set, as presented in Fig. 4, our algorithm detects 109 of the 112 faces, a success rate of 97.3%, whereas the Crete detector detects 98 of the 100 faces, a success rate of 98%. We observed that our detector failed mainly on faces that are too dark: dim lighting hides a significant part of the face, so the number of extrema in the LH subband is too small to detect the horizontal face boundary.
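The quoted rates follow directly from the face counts given above. The short Python check below reproduces them; the helper name `detection_rate` is introduced here only for illustration.

```python
def detection_rate(detected, total):
    """Percentage of faces detected, rounded to one decimal place."""
    return round(100.0 * detected / total, 1)


print(detection_rate(109, 112))  # 97.3 (proposed method, counts as quoted)
print(detection_rate(98, 100))   # 98.0 (Crete detector, counts as quoted)
```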

5. Conclusions and Future Work

The presented framework achieves fine face localization results without involving sophisticated methods, and is suitable for applications such as video telephony that require low delay and have limited computing power. The author has also noted the recent work reported in [13]; a further comparison with it and related works will be conducted in the next project.

Acknowledgement

The financial support provided by the NSC 96-2221-E-151-051 is gratefully acknowledged.

References

[1] J. Zhang, Y. Yan, and M. Lades, “Face recognition: eigenface, elastic matching, and neural nets,” Proc. IEEE, vol. 85, pp. 1423-1435, 1997.

[2] M. H. Yang, D. J. Kriegman, and N. Ahuja, “Detecting faces in images: a survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, pp. 34-58, 2002.

[3] C. Garcia and G. Tziritas, “Face detection using quantized skin color regions merging and wavelet packet analysis,” IEEE Trans. Multimedia, vol. 1, pp. 264-277, 1999.

[4] S. G. Mallat, “Multifrequency channel decomposition of images and wavelet models,” IEEE Trans. Acoust. Speech Signal Process., vol. 37, pp. 2091-2110, 1989.

[5] I. Craw, N. Costen, T. Kato, and S. Akamatsu, “How should we represent faces for automatic recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, pp. 725-736, 1999.

[6] S. G. Mallat and W. L. Hwang, “Singularity detection and processing with wavelets,” IEEE Trans. Inform. Theory, vol. 38, pp. 617-643, 1992.

[7] I. Daubechies, “Orthonormal bases of compactly supported wavelets,” Comm. Pure Appl. Math., vol. 41, pp. 909-996, 1988.

[8] http://www.humanscan.de/support/downloads/facedb.php

[9] http://www.identix.com/

[10] K. J. Kirchberg, O. Jesorsky, and R. W. Frischholz, “Genetic model optimization for Hausdorff distance-based face localization,” International ECCV 2002 Workshop on Biometric Authentication, Copenhagen, Denmark, LNCS-2359, pp. 103-111, 2002.

[11] J. Wu and Z.-H. Zhou, “Efficient face candidates selector for face detection,” Pattern Recognition, vol. 36, pp. 1175-1186, 2003.

[12] C. Garcia and M. Delakis, “Convolutional face finder: a neural architecture for fast and robust face detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 11, pp. 1408-1423, 2004.

[13] S. Phimoltares, C. Lursinsap, and K. Chamnongthai, “Face detection and facial feature localization without considering the appearance of image context,” Image and Vision Computing, vol. 25, no. 5, pp. 741-753, 2007.