
given by

h_i = W^{+} f_i \qquad (2.11)

where W^{+} is the computed pseudo-inverse of the basis matrix W. Once trained, the facial image set is represented by a set of encodings {h_1, h_2, ..., h_m} with a reduced dimension of rank r.
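As an illustration, the projection step of Eq. (2.11) can be sketched in a few lines of NumPy. This is only a minimal sketch, not the code used in the experiments, and the matrix shapes (pixels × images for the data, pixels × r for the basis) are assumptions.

```python
import numpy as np

def encode_faces(F, W):
    """Sketch of Eq. (2.11): encode each face column f_i as h_i = W^+ f_i.

    F : (n_pixels, m_images) matrix of vectorized faces (assumed layout).
    W : (n_pixels, r) learned non-negative basis matrix.
    """
    W_pinv = np.linalg.pinv(W)   # W^+, the Moore-Penrose pseudo-inverse of W
    return W_pinv @ F            # columns are the rank-r encodings h_1 ... h_m
```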

2.6 Metric Determination

Since the positive space learned by NMF and its extensions lacks a suitable metric, it is not directly adequate for further analysis such as object recognition with the nearest neighbor classifier. For this reason, a distance metric must be determined that works with the positive projected vectors in an optimal manner. In order to improve the face recognition accuracy, we evaluate various distance metrics on the learned feature vectors, trying to determine the best one for this specific problem.

We assume that x and y are two d-dimensional vectors. The aim is to calculate different distance metrics between a test feature vector and a prototype one, and to determine how close these vectors are to each other. For this, six commonly used distance measures are tested in the current work (a minimal implementation sketch of these measures is given after the list):

(1) The L1, Manhattan or Cityblock metric is defined as:

L1(x, y) = \sum_{i=1}^{d} |x_i - y_i| \qquad (2.12)

(2) The Euclidean or L2 metric is defined as:

Euc(x, y) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2} \qquad (2.13)


(3) The Correlation metric is defined as:

Cor(x, y) = \frac{\sum_{i=1}^{d} (x_i - x_{mean})(y_i - y_{mean})}{\sqrt{\sum_{i=1}^{d} (x_i - x_{mean})^2 \sum_{i=1}^{d} (y_i - y_{mean})^2}} \qquad (2.14)

(4) The Angular metric is defined as:

Ang(x, y) = \frac{\sum_{i=1}^{d} x_i y_i}{\sqrt{\sum_{i=1}^{d} x_i^2 \sum_{i=1}^{d} y_i^2}} \qquad (2.15)

which is the cosine of the angle between the two observation vectors measured from the origin, and takes values from -1 to 1.

(5) The Mahalanobis metric is defined as:

Mah(x, y) = \sqrt{(x - y)^T \, Cov(D)^{-1} \, (x - y)} \qquad (2.16)

where Cov(D) is the covariance matrix of the data set D.

(6) The Riemannian metric is defined as:

Rie(x, y) = (x - y)^T G (x - y) \qquad (2.17)

where G is a similarity matrix defined as G = (G_{ij}) = (\langle B_i, B_j \rangle) = B^T B, with B_i the i-th learned basis vector; \langle x, y \rangle denotes the inner product of x and y.
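A minimal NumPy sketch of the six measures is given below; it is not the implementation used in the experiments, and the argument names (cov for Cov(D), B for the matrix of learned basis vectors) are assumptions. Note that the correlation and angular measures are similarities, so a nearest neighbor classifier would pick the largest value (or, equivalently, minimize 1 − value) rather than the smallest.

```python
import numpy as np

def l1(x, y):                       # Eq. (2.12), Manhattan / city-block
    return np.sum(np.abs(x - y))

def euc(x, y):                      # Eq. (2.13), Euclidean
    return np.sqrt(np.sum((x - y) ** 2))

def cor(x, y):                      # Eq. (2.14), correlation (a similarity)
    xc, yc = x - x.mean(), y - y.mean()
    return np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))

def ang(x, y):                      # Eq. (2.15), cosine of the angle (a similarity)
    return np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y))

def mah(x, y, cov):                 # Eq. (2.16), Mahalanobis with Cov(D)
    d = x - y
    return np.sqrt(d @ np.linalg.inv(cov) @ d)

def rie(x, y, B):                   # Eq. (2.17), Riemannian with G = B^T B
    d = x - y
    return d @ (B.T @ B) @ d
```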

We have selected the AR color face database because it is a well-known database with a large number of color facial images captured under various conditions.

The six metrics have been tested with this database; in most cases the input images are preprocessed in order to reduce some distortion effects. In this case, we use the 120 × 120 grayscale images of the cropped AR database without any other modification.


method-rank     L1      Euc     Cor     Ang     Mah     Rie
NMF-25          0.500   0.540   0.540   0.545   0.430   0.685
LNMF-25         0.670   0.620   0.480   0.515   0.265   0.560
NMFSC-25        0.520   0.580   0.485   0.555   0.205   0.525
BNMF-25         0.505   0.490   0.485   0.510   0.340   0.690

Table 2.2: Recognition rates with NMF, LNMF, NMFSC and BNMF techniques using six different distance metrics. The number of basis components, also called rank, is 25.

method-rank     L1      Euc     Cor     Ang     Mah     Rie
NMF-75          0.620   0.575   0.645   0.655   0.620   0.730
LNMF-75         0.605   0.525   0.440   0.495   0.340   0.640
NMFSC-75        0.700   0.645   0.565   0.640   0.465   0.690
BNMF-75         0.630   0.590   0.635   0.665   0.615   0.745

Table 2.3: Recognition rates with NMF, LNMF, NMFSC and BNMF techniques using six different distance metrics. The number of basis components, also called rank, is 75.

Training images consist of the two neutral poses of each individual from both sessions. Images labeled F1 are used as the testing set because they contain a smile expression under normal conditions. The following part presents experimental evaluations of several traditional distance measures in the context of face recognition using NMF and its extensions.
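As a sketch of how such an evaluation can be run (a hypothetical helper, not the authors' exact protocol), a 1-nearest-neighbor loop over the rank-r encodings could look as follows, where `dist` is any of the distance measures listed above (for the correlation and angular similarities one would take the maximum instead of the minimum).

```python
import numpy as np

def recognition_rate(H_train, y_train, H_test, y_test, dist):
    """Hypothetical 1-NN evaluation over rank-r encodings (columns of H_*)."""
    correct = 0
    for j in range(H_test.shape[1]):
        # distance from the j-th test encoding to every training encoding
        d = [dist(H_test[:, j], H_train[:, i]) for i in range(H_train.shape[1])]
        if y_train[int(np.argmin(d))] == y_test[j]:
            correct += 1
    return correct / H_test.shape[1]
```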

Our current work compares the performance obtained by NMF and its extensions. Furthermore, we have also analyzed how different distance metrics affect the classification result in the projected space.


method-rank     L1      Euc     Cor     Ang     Mah     Rie
NMF-125         0.630   0.570   0.685   0.685   0.680   0.735
LNMF-125        0.635   0.490   0.455   0.445   0.365   0.690
NMFSC-125       0.710   0.630   0.660   0.680   0.615   0.735
BNMF-125        0.590   0.575   0.680   0.680   0.665   0.750

Table 2.4: Recognition rates with NMF, LNMF, NMFSC and BNMF techniques using six different distance metrics. The number of basis components, also called rank, is 125.

method-rank     L1      Euc     Cor     Ang     Mah     Rie
NMF-175         0.655   0.630   0.705   0.730   0.720   0.735
LNMF-175        0.590   0.455   0.450   0.425   0.350   0.735
NMFSC-175       0.715   0.640   0.720   0.695   0.715   0.745
BNMF-175        0.645   0.585   0.755   0.760   0.745   0.770

Table 2.5: Recognition rates with NMF, LNMF, NMFSC and BNMF techniques using six different distance metrics. The number of basis components, also called rank, is 175.

Faces are projected into subspaces of varying dimensionality, and each specific situation is analyzed when all the NMF-related algorithms are used for classification. Thus, we can learn when the performance of the BNMF technique is better and understand when it can be used for a further classification task.

By selecting a suitable metric when using BNMF, we can improve the NMF recognition accuracy with the same training set. Because BNMF adds more constraints on the bases to minimize its objective function, the learned bases can present non-negligible correlations and other higher-order effects. Therefore, we prefer a distance metric that explicitly takes the learned bases into account, and our experimental results agree with this idea.


Figure 2.6: Recognition rates versus the six different metrics with rank = 25.

In Tables 2.2, 2.3, 2.4 and 2.5, we notice that the best recognition rate is produced by BNMF using the Riemannian metric, which is the best one among the six different metrics. Moreover, across the analyzed ranks, the recognition rate produced by BNMF with the Riemannian metric generally outperforms that produced by the other NMF-related methods using the same metric. In particular, at the highest rank of 175, BNMF presents the best recognition rate of 0.77. This is due to the fact that BNMF learns more essential information about the faces, which helps BNMF generate a good estimation when trying to recover the original image.

The first impression is that the Euclidean distance is not the most suitable metric when working with BNMF; the Riemannian distance metric performs better across all the analyzed NMF-related techniques.


Figure 2.7: Recognition rates versus the six different metrics with rank = 75.

Figure 2.8: Recognition rates versus the six different metrics with rank = 125.


Figure 2.9: Recognition rates versus the six different metrics with rank = 175.

Figure 2.10: Recognition rates using the Riemannian distance metric with rank = 25, 75, 125 and 175 respectively.


Now we can show that adopting the Riemannian distance metric is more suitable than any other distance metric for face classification when using the nearest neighbor classifier. Let f_1, f_2 denote two facial vectors in the original n-dimensional space, and let h_1, h_2 be the corresponding learned coefficients in the lower r-dimensional space. To some extent, we can say f_1 = W h_1 and f_2 = W h_2, where W is the learned basis matrix. Then we can get

Rie(f_1, f_2) = (f_1 - f_2)^T (f_1 - f_2) = (h_1 - h_2)^T W^T W (h_1 - h_2) = (h_1 - h_2)^T G (h_1 - h_2) \neq (h_1 - h_2)^T (h_1 - h_2) \qquad (2.18)

This indicates that the Riemannian metric can preserve the neighborhood of the original samples for classification. The recognition accuracy for the AR database using the Riemannian metric is represented in Fig. 2.10, as a function of the number of basis components. All these techniques are able to improve recognition accuracy when a higher rank is used. Note that the highest recognition rate is achieved by BNMF with 175 basis components.
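The identity in Eq. (2.18) is easy to verify numerically. The following sketch uses random data with assumed sizes n = 1000 and r = 25; it shows that whenever f = W h holds exactly, the squared distance between the original vectors equals the Riemannian distance between their encodings with G = W^T W.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((1000, 25))                 # assumed basis matrix, n = 1000, r = 25
h1, h2 = rng.random(25), rng.random(25)
f1, f2 = W @ h1, W @ h2                    # reconstructions f = W h

lhs = (f1 - f2) @ (f1 - f2)                # squared distance in the original space
rhs = (h1 - h2) @ (W.T @ W) @ (h1 - h2)    # Riemannian distance with G = W^T W
print(np.allclose(lhs, rhs))               # True
```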

Finally, we can conclude that the Riemannian metric is the most suitable one for the metric determination problem studied here. Using this metric, BNMF clearly outperforms NMF and its other extensions for every number of basis components.

Chapter 3

Wavelet Transform

3.1 Introduction

The Wavelet Transform (WT) has been evolving for some time; mathematicians theorized its use in the early 1900s. While the Fourier transform converts time-domain components into the frequency domain for frequency analysis, the wavelet transform deals with scale analysis, that is, it creates mathematical structures that provide varying time/frequency/amplitude slices for analysis. A wavelet itself is a portion of a complete waveform, hence the term. The wavelet transform has the ability to identify frequency components simultaneously with their location in time. Additionally, the computational cost is directly proportional to the length of the input signal.

In wavelet analysis, the scale that one uses in looking at data plays a
